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sjssess^il^r cJp-crrcr'p vnt nKf jrvC ;rt?^c»*-K:t or £ lesr. ^-r rose s^frrn nr^o carro: 
<iirrcfns:rz!i£ rro*ici£nr^ r^ojrts: r appiicsto-^ c»* CRT tfcn^Quss has bee- 

:>nD9arT' tx u p-cceo jres to fip:»s nj C^^T -leo^r and to Pi-aiue^ rit ariKjjari r% 

CRT 3pp^oa2^ r e v-antrf cr^ trannj v^kioti Rrttted eMcrts ift ^? Tecrrnsa A-CB rcUid?' 

C>^xce:ii*^ to pffriormana-Dassfd tre'ruig ,r tank gunner,- TtDDCf. ^pe-imentj to 
corroare ne asar&:r» ci* sevtra CRT rroiet; r. f -111-19 ernpncat dra (WETTEST), and tne 
s^TteTiatK dev^comers: erf trainng and testing obj&r:u'« to tank gunnerif lUVEf IRE) 



This publication outlines the rationale for using the CRT approach and
suggests specific guidelines for developers in constructing CRT items.
Methods for assessing the adequacy of a CRT are also provided.



ARI research in this area is conducted as an in-house effort augmented by
contracts with organizations selected as having unique capabilities and
facilities for research in a specific area. The present work was conducted
by personnel of the Army Research Institute and Applied Sciences
Associates, Inc., under Contract Number DAHC t9^74-(>oaiaL, and is
responsive to requirements of RDTE Project 2Q154^f5A757, Training System
Applications, FY 74.




J. E. UHLANER
Technical Director

This manual is intended for use by persons involved in testing. You
will find this manual useful if your work involves any phase of test
construction, test administration, or use of test results. Whether you are
involved with just a small segment of test construction and use, such as
writing a few test items or helping administer performance tests at a
field station, or whether you supervise an entire test construction or
test administration process, you will find helpful guidance in this manual.








This manual is a carefully researched presentation of what is known
about Criterion-Referenced (CR) testing, written in a "how-to-do-it" style.
Examples used in this manual to illustrate points are drawn from the
experience of Army test personnel working in a variety of Army situations.
Although test construction and use requirements differ in various Army
facilities, this manual has been tailored to be as useful to you as
possible, no matter what particular processes are used to develop and
administer tests at your location. Consequently, while this manual does
present an overall ... without violating the overall way in which you
develop tests. Of course, if you follow the overall process presented in
this manual, you can be more certain that you will develop tests that will
measure what you want them to measure.

While there are certain technical questions involving CRT construction
on which testing experts fail to agree, there is basic agreement on many
major elements. So, if you are presently involved with test development and
use, you will find in this manual guidelines that can help you in
performing your particular testing job, steer you around problems, and
help ensure that your tests work as well as possible.

The emphasis in this manual is on test development. If you are involved
only in the administration of tests, you might want to read just Chapter 6,
which covers administering and scoring of tests. If you are involved in
only a small segment of an overall test construction effort, or if you have
a problem with a specific aspect of test development, you may just want to
consult the relevant section of this manual. Refer to the table of contents
to find the appropriate reading to aid you.

After you are familiar with the test construction processes contained
in this manual, you may wish to use a checklist to guide you in your test
development activities. The Checklist for Constructing CRTs contained in
Appendix A of this manual will help assure that all steps in the test
construction process are covered adequately.

If you ... you may want ... evaluation. You may also want to use this
checklist as a guide for reviewing tests you build prior to formal tryout.
This checklist appears in Appendix ...

The following features help make this manual easy to use:

• Review questions and answers for each chapter (in Appendix E)
  will help you to supplement your basis of understanding for
  that chapter.

• Pages are numbered within chapters.

• Chapters have flowcharts when necessary to show the sequence of
  operations required for completing CRT development tasks. The
  flowcharts fold out so you can refer to them as you read the
  text. By using these flowcharts, you can see just where you are
  in the CRT development process.

• Major points are surrounded by boxes, and other points are
  identified by bullets to make them stand out.

• Examples are highlighted for easy reference.

Use of Criterion-Referenced Tests (CRTs). CRTs are relative newcomers
to the field of testing. Because of their advantages, CRTs are rapidly ...

To clear up any confusion, let us define Criterion-Referenced testing (CR
testing). A CRT measures what an individual knows or can do compared to
what he must be able to do or must know in order to successfully perform a
task. This means that an individual's performance is compared to
(referenced to) some external criteria, or performance standards. These
standards are derived from an analysis of what is required to do a
particular task successfully.

The traditional approach to testing is called Norm-Referenced testing
(NR testing). In NR testing, an individual's performance is compared
to the performance of other individuals. For example, any time your work
is scored "on a curve," your performance is being compared to that of
others. Suppose an individual takes an NRT on his ability to repair the
2-1/2 ton truck transmission and scores at the 95th percentile. At best,
all this tells you is that the individual can repair such a transmission
better than 95 out of 100 other individuals who take the test. It does not
tell you that the individual can repair this transmission to specific test
standards -- that he can fix it so that it will work and hold up for a
reasonable period of time under normal operating conditions. A CRT on the
same subject would tell you whether or not the individual could repair the
transmission to the appropriate standards. Scores from this CRT might be
recorded in terms of "go" or "no-go." All individuals who received a "go"
(or a "pass") on the CRT would be able to repair the 2-1/2 ton truck
transmission to the test standards. You would not necessarily know whether
one individual who got a "go" did better work than another who also got a
"go," but you would know that both had enough knowledge and skill to
repair such transmissions.
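
The difference between the two kinds of score can be sketched in a few
lines of Python. This is a minimal illustration, not one of the manual's
procedures: the function names, the group scores, and the standard of 70
are all hypothetical.

```python
from bisect import bisect_left


def percentile_rank(score, all_scores):
    """NRT-style result: percent of test takers who scored below this score."""
    ordered = sorted(all_scores)
    below = bisect_left(ordered, score)
    return 100.0 * below / len(ordered)


def go_no_go(score, standard):
    """CRT-style result: compare the score to a fixed standard, not to other people."""
    return "go" if score >= standard else "no-go"


# Hypothetical scores on a transmission-repair test (illustrative only).
group_scores = [35, 40, 42, 45, 48, 50, 52, 55, 58, 60]
standard = 70  # hypothetical minimum score representing an acceptable repair

best = max(group_scores)
print(percentile_rank(best, group_scores))  # standing relative to the group
print(go_no_go(best, standard))             # result relative to the standard
```

In this made-up group the best scorer outranks nearly everyone else and
still receives a "no-go," which is the situation the paragraph above
describes: a high standing in the group says nothing about meeting the
standard.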

In many cases, you can't tell a CRT from an NRT just by looking at
the test. Items on both tests might look the same. Both CRTs and
NRTs may have multiple-choice items or fill-in-the-blank items. They
both may use simulated performance measures such as:

• Tie the tourniquet on the dummy's leg

• Demonstrate proper bayonet procedures using the rubber mock-up
  M-16 and bayonet

or hands-on performance measures such as:

• Disassemble this weapon

• Connect the calling party to the called party using standard
  field switchboard

• What symptoms indicate that an atropine injection should
  be administered?

or skill items such as:

• Compute the elevation required for firing a howitzer round
  from point X to specified grid coordinates

• Replace the faulty component on this radio chassis

Both types of tests may have paper-and-pencil performance items. For
example:

• Plot the quickest route from point A to point B on the
  topographic map supplied.

or actual performance items. For example:

• You are dropped at point A in the test range. Using the map
  and magnetic compass provided, get to point B within two hours.



So, looking at a test will not necessarily tell you whether or not it is
a CRT. To determine if a test is a CRT, you need to find out how it was
developed, what it is used for, and how the score is interpreted. A test
is criterion-referenced if:

• The test items are based upon training objectives which, in
  turn, were developed from performance objectives external to
  training. That is, the development of the test can be
  directly traced to a consideration of the tasks which the
  trainee will eventually perform on the job.

• The test is primarily used for measuring mastery. That is,
  the test is designed to determine whether or not the
  individual has mastered particular tasks. CRTs may also be
  used to assess instructional programs; that is, they may
  help determine whether or not programs do train individuals
  to achieve mastery.

• Scoring of the test is based upon absolute standards such
  as job competence rather than upon relative standards such
  as class standing.

If a test meets the above three criteria, it is criterion-referenced.

As you can see from this table, only by using CR testing can you know
whether an individual is prepared to do a job. NRTs may be able to tell
you which individuals are more proficient than others, but not which are
able to do the job.







CR Testing                             | NR Testing
---------------------------------------+--------------------------------------
Requires a careful analysis of skills  | May be based on course content taught
and knowledges needed for performing   | or instructor's assumptions of what
tasks on which individuals are to be   | individuals need to know. Task
tested. Task analytic data provide     | analytic data are not necessarily
the basis for the construction of      | considered.
items.                                 |
---------------------------------------+--------------------------------------
Test results indicate whether or not   | Test results indicate how well an
an individual can perform a task to    | individual does (or how much an
acceptable standards.                  | individual knows) as compared to
                                       | others who have taken the test.
---------------------------------------+--------------------------------------
Test results are most useful for       | Test results are most useful for
making absolute decisions, such as     | making relative decisions, such as
whether or not a person is ready to    | who knows more, who works more
perform a particular job task.         | quickly, or class rank.

Figure 2-1. Comparison of CR Testing with NR Testing



WHEN TO USE CRTs

You can develop and use CRTs for a variety of purposes. The foremost
use of CRTs is to answer the question "How well can the individual
perform compared to how well he needs to perform to accomplish a task?"
In other words, you should use a CRT whenever you need to find out if an
individual knows and can do what is required in order to perform the tasks
for which he is being trained.



Remember though, you will have to be able to meet two other criteria,
aside from wanting to answer the above question, before you can build a
CRT:

• First, you'll have to be able to base your test items on training
  objectives which were developed from performance objectives
  external to training. So, if you can't point to external
  performance objectives (what the individual should be able to
  do on the job after training), you can't develop a CRT that will
  be a useful measure of job performance.

• Second, you'll have to be able to score the test on an absolute
  basis. If the test won't be scoreable in this way -- that is,
  if you can't specify the minimum acceptable standards for
  adequate performance -- then you won't be able to build a CRT.

A properly constructed CRT will allow you to classify the people who
take it into two groups:

• Masters -- those who you are reasonably sure can do what they are
  trained for,

and

• Non-masters -- those who you are reasonably sure cannot adequately
  do what they are trained for.

A CRT, then, lets you find out whether or not an individual has mastered
a task or skill.
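
As a rough sketch of this two-group classification, the Python below
sorts test takers into masters and non-masters by comparing each score to
a cut-off. The names, scores, and cut-off value are hypothetical, and a
single numeric cut-off is an assumption made for illustration; a real CRT
may set its standards objective by objective.

```python
def classify(results, cutoff):
    """Split test takers into (masters, non_masters) using an absolute cut-off.

    results: dict mapping a person's name to a test score.
    cutoff:  minimum score that represents acceptable performance.
    """
    masters = sorted(name for name, score in results.items() if score >= cutoff)
    non_masters = sorted(name for name, score in results.items() if score < cutoff)
    return masters, non_masters


# Hypothetical class results (illustrative only).
results = {"Adams": 78, "Baker": 69, "Clark": 80, "Davis": 70}
masters, non_masters = classify(results, cutoff=70)
print(masters)       # people who met the standard
print(non_masters)   # people who did not
```

Note that the result says nothing about who did best among the masters;
the classification depends only on the standard.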

If you are interested in finding out who does best, who does average,
and who does worst, you should not use a CRT. In fact, whenever you want
to answer the question "How well does an individual do compared to
others?", you should use an NRT instead of a CRT. NRTs are designed to
produce large differences in the scores of people taking them, so they can
be used for helping you find out who does best, second best, third best,
etc. CRTs, though, usually don't produce large score differences -- all
masters may get just about the same score -- so they are not good for
helping you put people in the order of how well they do compared to one
another.

CRT or NRT?

Suppose you wanted to test a class at the end of training and name
the two top scorers as honor graduates.

• Question: Would you want to give the class a CRT or an NRT?

If you give a CRT and most of the class are masters -- that is, they
can do what you have trained them to do -- many people may get the top
score, so which people would you name as honor graduates? On the other
hand, if you give an NRT, you will probably find two individuals who
clearly score highest in the class. But with an NRT all you learn is how
these individuals did on the test compared to the other people who took
it. It does not necessarily tell whether or not these two have mastered
the tasks. Just the same, you would have a clear basis for naming the two
individuals who scored highest on the NRT as honor graduates. So, if you
must name honor graduates (or select a few people for promotion or other
special honors), you would be better off using an NRT. But if you want to
find out who in your class has mastered the training, you had better use
a CRT.
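
The reason a CRT can leave you without a clear pair of top scorers can be
sketched in Python. The two score lists below are invented for
illustration (they are not data from this manual), and the helper name is
hypothetical. The helper simply checks whether the top n scores are
separated from the rest by something other than a tie.

```python
def top_n_is_clear(scores, n):
    """Return True if the n highest scores are not tied with the next score down."""
    ordered = sorted(scores, reverse=True)
    if n <= 0 or n >= len(ordered):
        raise ValueError("n must be between 1 and len(scores) - 1")
    return ordered[n - 1] > ordered[n]


# Hypothetical CRT results: most of the class are masters, so scores bunch at the top.
crt_scores = [80, 80, 80, 80, 80, 80, 78, 78, 78, 75]

# Hypothetical NRT results: the test is built to spread people out.
nrt_scores = [97, 91, 84, 80, 77, 71, 66, 60, 52, 45]

print(top_n_is_clear(crt_scores, 2))  # False: six people share the top score
print(top_n_is_clear(nrt_scores, 2))  # True: two people clearly lead
```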

Now, suppose you receive a directive indicating that approximately
five percent of your class are to be identified as honor graduates. You
give the class a CRT which has a cut-off point at the score of 70. Anyone
who scores 70 or above on the test has met the minimum acceptable standards
on the tasks you've trained them to perform. Eighty is the top score
possible -- it represents perfect performance on the tasks tested.

There are 100 people in your class and they received the following
scores:




Score               Number of people in class who got this score

80                  20
78                  40
77, 76, 75, 74      [illegible], 10
72                  5
71                  5

Total               100

Figure 1-2. Sample Test Results

Now what do you do? Not only has everyone in class passed the test,
but 20 percent of them have achieved perfect scores. Which people would
you designate as honor graduates? You would have to find some way
other than CRT scores to identify five percent of your class as honor
graduates.

So, if you need to use a CRT, you shouldn't choose among class
members on the basis of results. All a CRT can really say is
who can do what they're supposed to.




OTHER USES OF CRTs

Screening Devices

Another use of CRTs is as a screening device. If an individual
needs to possess certain entry behaviors before he starts an advanced
course, for example, you might want to give him a CRT before permitting
him to start the course. In this case, the CRT would be based on
objectives for tasks that the individual should be able to perform before
beginning the course. A learner's permit test, for example, is often
used as a screening device in automobile driver licensing: If an
individual passes this test it means that he has the entry level
knowledge -- knowledge of state traffic laws -- and can be considered ready
to begin hands-on driver training.

You can also use a CRT as a screening device to see if the individual
already knows how to perform some of the tasks. In some cases, an
individual may be able to do a job without taking a training course
because he has had appropriate past experience, or was trained for
something similar. For cases like this, you may want to test this
individual at the beginning of the course (or block of instruction, or sub
course, or specialty area) with the same CRT you would give to the rest of
the class at the end of the course (or block of instruction, etc.). If the
individual achieves a mastery-level score on the test, then you won't have
to waste resources or time by putting him through something that he can
already do.
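
A screening decision of this kind can be sketched as follows. The
objective names, scores, and cut-offs are hypothetical, and scoring each
objective against its own cut-off is an assumption made for illustration.

```python
def objectives_not_mastered(pretest_scores, cutoffs):
    """Return the objectives on which the pretest score falls below the cut-off.

    An objective with no recorded score is treated as not mastered.
    """
    return [
        objective
        for objective, cutoff in cutoffs.items()
        if pretest_scores.get(objective, float("-inf")) < cutoff
    ]


def can_skip_course(pretest_scores, cutoffs):
    """True only if every course objective is already at mastery level."""
    return not objectives_not_mastered(pretest_scores, cutoffs)


# Hypothetical end-of-course CRT given at the start of the course.
cutoffs = {"traffic laws": 70, "vehicle controls": 70}
experienced = {"traffic laws": 85, "vehicle controls": 74}
newcomer = {"traffic laws": 72}

print(can_skip_course(experienced, cutoffs))       # already a master on both
print(objectives_not_mastered(newcomer, cutoffs))  # what still needs training
```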



Diagnostic Aids

CRTs may also be useful as diagnostic aids. You can build a CRT so
that it shows just what objectives an individual is weak on (has not yet
mastered) and even what particular steps of a certain procedure he is
unable to perform. A diagnostic CRT on drill and ceremonies, for example,
might show that an individual cannot correctly execute "parade rest." By
examining that individual's test score sheet, you might find that he
failed parade rest because he did not hold his head and eyes at the
position of attention. Remediation for this person becomes simple: You
don't have to teach him all the steps of parade rest -- his feet, arms and
hands are in correct positions -- you only have to teach him to hold his
head and eyes at attention. Of course, this is an overly simple example,
but the principle holds true for much more complicated tasks, such as
flying a helicopter.
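
The parade rest example can be written out as a small Python sketch. The
score sheet layout (one pass/fail entry per step) and the function names
are assumptions made for illustration; the manual does not prescribe a
particular score sheet format here.

```python
def failed_steps(score_sheet):
    """Return the steps marked as not performed correctly, in score sheet order."""
    return [step for step, performed_ok in score_sheet.items() if not performed_ok]


def task_result(score_sheet):
    """A task is a 'go' only when every step on the score sheet was performed correctly."""
    return "go" if not failed_steps(score_sheet) else "no-go"


# Hypothetical score sheet for "parade rest".
parade_rest = {
    "feet": True,
    "arms and hands": True,
    "head and eyes": False,
}

print(task_result(parade_rest))   # overall result on the task
print(failed_steps(parade_rest))  # what remediation should cover
```

Recording results step by step, rather than as a single total, is what
makes the test usable for diagnosis.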

Evaluation of Instruction

A final, major use of CRTs is to answer the question "Has the
instructional program taught what it is supposed to teach?" That is, you
can use a CRT to evaluate how good an instructional program is. If you have
designed an instructional program to train people in specific tasks, you
can use a CRT to find out how good the program is, as follows:

• First, you find an appropriate group of people who cannot do the
  tasks -- the CRT should show that they are non-masters on these
  tasks.

• Then these people go through the instructional program.

• Finally, you test them with the CRT again.

If the instructional program is good, most should score as masters
on the CRT after they've had the program.
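
The three steps above can be sketched as a before-and-after comparison.
This is only an illustration: the names, scores, and cut-off are
hypothetical, and summarizing the result as a single proportion is an
assumption, not a statistic the manual specifies.

```python
def proportion_reaching_mastery(pre_scores, post_scores, cutoff):
    """Of the people who were non-masters before instruction, what fraction
    score as masters afterwards?"""
    non_masters_before = [name for name, score in pre_scores.items() if score < cutoff]
    if not non_masters_before:
        raise ValueError("no non-masters in the pretest group; pick a different group")
    masters_after = [name for name in non_masters_before if post_scores[name] >= cutoff]
    return len(masters_after) / len(non_masters_before)


# Hypothetical results on the same CRT before and after the program.
pre = {"Adams": 40, "Baker": 55, "Clark": 62, "Davis": 75}
post = {"Adams": 72, "Baker": 70, "Clark": 65, "Davis": 78}

print(proportion_reaching_mastery(pre, post, cutoff=70))
```

Davis is left out of the calculation because he was already a master
before instruction, so his later score says nothing about the program.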



OVERVIEW OF CRT CONSTRUCTION PROCESS

There is no single correct way to construct a CRT. The construction
process outlined in this section is designed to help you construct and
use CRTs that will be suitable for their intended applications. Following
this process will help you cover all points necessary for an adequate
test. However, your own imagination and ingenuity will be required to
create workable tests. The process presented in this manual is designed
to be applicable to diverse types of testing needs and situations,
regardless of subject matter. Thus, you will need adequate knowledge of
the subject matter or access to subject matter experts.

Remember, the CRT construction process presented here is only one
way of constructing and using CRTs. There may be other useful approaches
which you have been following. Consequently, regard the information
presented within the steps of this process as guidelines to aid you, not
as absolute doctrine. If the process conflicts with your procedures, use
only those guidelines which help you. If, on the other hand, you are
starting from scratch in the test development process, you will find the
CRT construction procedures presented here to be a simple and efficient
method for constructing CRTs that will do the job. Here is a brief outline
of the major steps for constructing, using and evaluating CRTs. They will
be described in greater detail in Chapters 2 through 7.

1. Assessing Inputs to the CRT Development Process. In this step you
   assess the adequacy of the objectives that you will use in developing
   CRTs. Inadequate objectives must be revised or discarded. In
   assessing the adequacy of objectives, you will carefully consider
   their three main parts:

   • Performances -- what the objective requires people to know and do.

   • Conditions -- the situations under which people's performance
     will be evaluated.

   • Standards -- the level of performance which indicates satisfactory
     achievement of the objective.

2. Developing a Test Plan. Before writing test items, you should plan
   the test. In this step you develop a test plan by considering a
   number of factors including:

   • Practical constraints -- do factors such as time and manpower
     availability, costs, etc. affect the way the test must
     be built?

   • Item format -- are the objectives best tested by written items,
     performance items, measures of how a performance is done,
     measures of products resulting from performance, etc.? How
     realistic should the test items be?

   • Number of items -- how many items should be made for each
     objective? What kinds of conditions should the items
     include?

3. Constructing the Item Pool. In this step you create the items called
   for by your test plan. Whenever your test plan calls for one item,
   you create two. In this way you will create an item pool from which
   the best items can be selected by tryout and review procedures. After
   you have prepared all the items for your item pool, you assess the
   adequacy of each item considering such factors as:

   • Does it match the objective for which it was created?

   • Is it clear and unambiguous?

   • Is it reasonably easy to administer?

   • Is it at the appropriate level of realism as specified in the
     test plan?

   In this step you also prepare instructions which tell how each item
   is to be administered. In addition, general instructions for the
   test as a whole must also be developed.

4. Selecting Final Test Items. In this step you try out the item pool and
   obtain reviews of the test items. Poor and redundant test items are
   revised or discarded, as necessary. You may also have to create and
   try out new items, if the first tryout and reviews eliminate items
   which leave gaps in the test plan.

5. Administering and Scoring the Test. In this step you create the
   scoring standards and administrative procedures for the test. You
   develop and document standardized conditions for test administration
   so the test can be administered and scored by others using your
   documentation. You also develop cut-off points for your test which
   tell what a passing score on the test (or on each of the objectives)
   is.

6. Measuring Reliability and Validity. In this step you evaluate the
   reliability of your test -- that is, you find out if the test measures
   the same thing each time it is given. You also evaluate the validity
   of your test -- that is, you determine whether or not it is actually
   measuring what it is supposed to measure. If your test has low
   reliability or validity you must consider ways of improving the test.
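
One simple way to look at reliability for a go/no-go test is to give the
test twice and see how often each person lands in the same group both
times. The Python sketch below assumes that approach. It is an
illustration only and not necessarily the method described in the
manual's later chapters; the names, scores, and cut-off are hypothetical.

```python
def classification_agreement(first_scores, second_scores, cutoff):
    """Fraction of people classified the same way (master or non-master)
    on two administrations of the same test."""
    same = sum(
        (first_scores[name] >= cutoff) == (second_scores[name] >= cutoff)
        for name in first_scores
    )
    return same / len(first_scores)


# Hypothetical scores from two administrations of one CRT.
first = {"Adams": 72, "Baker": 65, "Clark": 80, "Davis": 69}
second = {"Adams": 75, "Baker": 71, "Clark": 78, "Davis": 60}

print(classification_agreement(first, second, cutoff=70))
```

A value near 1.0 would suggest the test sorts people consistently; a low
value would be a signal to look again at the items, the administration
conditions, or the cut-off.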



ESSENTIAL STEPS

Whether or not you use the CRT construction process step-for-step as
described in the manual, you should be sure that the following essential
points are covered in developing and using tests:

• Test items should be developed to reflect the attainment of
  objectives, which in turn are developed from independent
  analyses of the tasks. Test items should measure the
  performance specified in the objectives, under the appropriate
  conditions, to the specified standards.

• You should make sure that your test items meet the practical
  constraints of the training and testing situations, and that
  you try out your test items. Trying out items is the only
  certain way of finding out which items work best.

• You should review the results of the tryout and evaluate
  the items with peers, test evaluation units and subject-
  matter experts.

• You should provide appropriate administration and scoring
  procedures to be used with your CRT to ensure that the
  CRTs will be administered and used in a uniform,
  appropriate way.

Inputs to the CRT development process are called objectives.
... objectives that tell what people must do ...

Objectives, and the tests which are developed from them, can be written
at several levels of detail. It is important to grasp what these
levels are because they influence how tests are prepared and used.
Understanding these levels can also help you judge the adequacy of the
objectives from which tests must be derived.



Three basic levels can be identified:

• Level 1 refers to objectives which are prepared on the basis of
  doctrine and/or experience about actual, meaningful units of work
  activity. Objectives in this category have been labeled:

  • Job Performance Requirements (JPRs)
  • Performance Measures (PMs)
  • Job Objectives (JOs)
  • Task Objectives

  The exact labels are not important. What is important is knowing
  that Level 1 objectives refer to meaningful units of work activity
  performed under operational conditions, and according to
  operational standards. That is, Level 1 objectives tell what must
  be done on the job. The job-task analyst is principally responsible
  for such objectives.




• Level 2 objectives are essentially Level 1 objectives which have
  been modified by the training system or by the training program
  designer so that they match training resources and safety
  requirements. Level 2 still refers to meaningful units of work
  activity. Objectives in this category have been labeled:

  • Training Objectives
  • Instructional Objectives
  • Instructional Goals
  • Learning Objectives
  • Terminal Training Objectives

  This level describes work activities which can stand by themselves
  and still be meaningful. For example, operating a multimeter would
  be a Level 2 objective, if the intention were to train assembly
  line workers to perform quality checks to make sure that multimeters
  are operating properly before they are packaged for shipment.
  However, operating a multimeter is not necessarily a meaningful
  activity, apart from troubleshooting a malfunctioning electronic
  circuit. Operation of a multimeter in that case would be defined
  as a Level 3 objective, which will be described later.

  The point is: Level 2 objectives tell what a person must be able
  to do at the end of training, not necessarily in an operational
  environment (on the job). While the training program designer is
  principally responsible for these objectives, test developers have
  important contributions to make along with job task analysts and
  unit commanders. Testing at this level is designed to screen out
  individuals who have not mastered the objective(s) of a particular
  stage of training.

• Level 3 objectives refer to activities (component skills and
  knowledges) which are not directly useful by themselves. They are
  generated in an attempt to make training efficient and manageable.
  Labels used at this level include:

  • Enabling Objectives
  • Knowledges
  • Skills
  • Intermediate Objectives
  • Learning Elements (mental, physical, information, and
    attitude elements)

THE THREE MAIN PARTS OF OBJECTIVES

Before constructing a CRT, it is necessary to take a close look at
the objective(s) on which the test is to be based. You must thoroughly
check each part of the objective. A properly written objective,
regardless of level, should consist of the following three parts:

• Performance (Task)

• Conditions

• Standards

You are probably already familiar with these parts of an objective,
but you may know them by other names. Figure 2-1 shows some of the other
labels by which the main parts of objectives are identified.



Performance

  • Task
  • Action
  • Skills, knowledges
  • Subtask
  • Objective (sometimes used as a label for performance only)

Conditions

  • [illegible] conditions
  • Environment*
  • Tools and equipment*
  • Training conditions*
  • Job aids*
  • Materials required*
  • Notes*

Standards

  • Training standard
  • Criterion (plural: criteria)
  • [illegible] standards
  • Pass/fail standards
  • [illegible] standards

*These are specified kinds of conditions, all of which go to
make up conditions as a whole.

Figure 2-1. Some Synonyms for the Parts of an Objective

Let us consider each of these main parts separately. After this we
will look at examples of how to divide objectives into their three parts.

Every objective should state precisely what the individual must do.
The statement of performance must be clear enough for that performance to
be trained and tested. Examples of performances stated in objectives are:

• Climb the telephone pole

• Disassemble an M-16 rifle

• State the conditions for which a tourniquet should be applied

• Camouflage the helmet

• Add two five-digit numbers

Note that every statement of performance includes an action verb. This
verb usually is the key to the performance. It tells what must be done.
For example, in the statement of performance "State the conditions for
which a tourniquet should be applied," the action verb is "state." You
can test the student's ability to state conditions. Suppose that
statement of performance had read "Appreciate the conditions for which a
tourniquet should be applied." Would you know what to test? How would
you know when a student "appreciates" the conditions?

Sometimes, though, the action verb is not the key to the performance
to be trained and tested. It may be only the indicator of the performance.
Any time that you can't point to the performance itself, the action verb
should specify the appropriate indicator of that performance. For example,
consider the statement of performance "Add two five-digit numbers." It
is clear that the performance called for is "adding." But how do you know
when someone successfully adds two numbers? Obviously, an indicator must
be supplied, since you can't observe the act of adding. So you would
attach an indicator to the statement of performance; i.e., "Add two
five-digit numbers and write the answer in the space below." Note that
although "write" is the observable action, the main intent of the
performance is adding, not writing. If the statement of performance calls
for an action (has a main intent) that is not directly observable, an
appropriate indicator must be added. We will discuss main intents and
indicators further in the next section, "Assessing the Adequacy of
Objectives."



Conditions

Every objective should include a statement of the conditions under
which the performance must be demonstrated. Such statements should
specify:

• What the student has to work with: what he is allowed
to use (tools, reference materials, etc.)

• Where performance must be demonstrated (nighttime conditions,
classroom conditions, etc.)

• What the student must work on: his starting points (the
"givens"; e.g., given a Mark II chassis. . .)

• Any limitations, special instructions, etc.



It is very important for an objective to specify all conditions which
may affect performance. Without statements of these conditions, you can't
be sure of just what to train or to test. Suppose, for example, that an
objective stated "Be able to disassemble and reassemble an M-60 machine
gun." You, the foot soldier, read the objective, receive training, and
are ready to be tested. Your drill sergeant takes you into a windowless
room, closes the door, hands you the machine gun, turns off the lights
and says "Okay, disassemble and reassemble this weapon."



You say, "But Sergeant, the objective didn't say anything about doing
it in the dark." He answers, "This is a combat weapon and you might have
to use it anytime, night or day. I won't always be around to turn the
lights on for you."

So, if conditions aren't stated, the student won't know exactly
what he needs to learn to do. And, as a test developer, you won't know
just what it is you should test. If you read the preceding objective,
what conditions would you test under? Day? Night? Classroom? Rain?
You would have to make an educated guess because you really wouldn't know.

Often performance must be demonstrated under multiple conditions.
These must be specified. For example, if a student must learn to navigate
through many different types of terrain, the objective should state each
of the terrain conditions through which the student will have to find his
way. Sometimes performance may be demonstrated under any of several
conditions. In such cases, the statement of conditions in the objective
should make clear that the performance need be demonstrated under only one
of the conditions. For example, an objective requiring a trainee to
determine the coordinates of a grid on a map may state "The trainee may do
this indoors or outdoors."



The following statements represent some example conditions (statements
of conditions are underlined):

• _Given the volume of a sphere and the appropriate formula_, compute
the diameter of the sphere.

• Cross a standard obstacle course _in the rain_.

• List the major components of a jeep clutch and their part numbers,
_using the reference manual provided_.

• Replace the transistor on this circuit board _without causing heat
damage to the adjacent crystal diode_.



Standards 



Each objective should specify the standard (criterion) by which
performance is evaluated.



In other words, every objective should indicate how well or how
quickly (or both) a performance must be done. As is the case for
statements of performance and conditions, standards, too, must be clearly
stated in the objective or you won't know how to train or test. For
example, suppose an objective only stated "Be able to type reasonably
accurately using an electric typewriter under standard office conditions."
Lacking standards for speed and accuracy, how fast would you train people
to type in order to satisfy the objective? How fast would they have to
type to pass a CRT? Obviously, the objective is lacking a clear statement
of standards ("reasonably accurately" doesn't really tell you anything). A
complete objective might read "Using an electric typewriter in standard
office conditions, be able to type 50 words per minute corrected for
accuracy (one word per minute subtracted per mistake)." Working from such
an objective you would know what standards to shoot for in training and
the level of performance a person has to demonstrate on a test.
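The typing standard above can be turned into a scoring rule. The following is a minimal sketch, not a prescribed scoring procedure: the function names are hypothetical, and it assumes that "corrected for accuracy" means gross words per minute minus one word per minute for each mistake, as the example objective states.

```python
def corrected_wpm(words_typed: int, minutes: float, mistakes: int) -> float:
    """Gross words per minute, less one word per minute for each mistake."""
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    gross = words_typed / minutes
    return gross - mistakes


def meets_typing_standard(words_typed: int, minutes: float, mistakes: int,
                          required_wpm: float = 50.0) -> bool:
    """True if the corrected rate reaches the standard stated in the objective."""
    return corrected_wpm(words_typed, minutes, mistakes) >= required_wpm


# A trainee types 275 words in 5 minutes with 3 mistakes: 55 gross, 52 corrected.
print(corrected_wpm(275, 5, 3))
print(meets_typing_standard(275, 5, 3))
```

With "reasonably accurately" as the only standard, no such rule could be written, which is the point of the paragraph above.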

There are six specific types of standards that can be stated in
objectives to indicate how well (quality) or how quickly (time) a
performance must be done or a product completed. Figure 2-2 describes
these types of standards. An objective should specify at least one of the
six types of standards in order to be complete. Often an objective will
combine several types of standards; for example, one of quality and one of
time specifications.

Figure 2-2. Six Types of Standards

Separating Objectives Into Their Three Parts

It is a relatively simple matter to separate an objective into its
three parts. Let's look at a couple of examples of doing this.

Consider this objective: "Given a map with two points circled and a
protractor, be able to measure the grid azimuth from point A to point B
and state the correct answer (within ± 2 degrees) in 120 seconds or less."
Here is how you would divide the objective into its three parts:

• Performance. One performance is called for: "being able to measure
grid azimuth." Note that measuring azimuths is the main intent
of the objective, while stating the azimuth is the indicator of
the performance. You would have no doubt about the trainee's ability
to state something; what you want to know is if he can correctly
measure grid azimuths (but you'll only know this if he measures
it, then states it). Other indicators might include writing the
grid azimuth, checking the correct answer on a multiple choice
list of five alternatives, etc.

• Conditions. The conditions stated in this objective are "givens,"
that is, the map with two points circled and a protractor.
Environmental conditions are not important, so they are not stated.
You could assume that the trainee would have to be able to perform
this task under any ordinary conditions: indoors or outdoors, in
bright light or relatively dim light, etc.

• Standards. Two standards are stated in this objective. First,
the trainee must state the correct grid azimuth within ± 2 degrees.
This is a "minimum acceptable level" standard. Second, the trainee
must perform the task within 2 minutes. This is a "time
requirement" standard.
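The three-part breakdown above can be sketched as a small record with a check for missing parts, plus a scoring rule for the two standards of the azimuth objective. This is an illustrative sketch only: the class, field, and function names are hypothetical, and treating azimuths as wrapping around at 360 degrees is an assumption the objective does not spell out.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Objective:
    performance: str                      # the main intent
    conditions: List[str] = field(default_factory=list)
    standards: List[str] = field(default_factory=list)
    indicator: Optional[str] = None       # only needed for unobservable intents

    def missing_parts(self) -> List[str]:
        """Names of the three main parts that are not stated."""
        missing = []
        if not self.performance.strip():
            missing.append("performance")
        if not self.conditions:
            missing.append("conditions")
        if not self.standards:
            missing.append("standards")
        return missing


azimuth = Objective(
    performance="Measure the grid azimuth from point A to point B",
    indicator="State the azimuth",
    conditions=["map with two points circled", "protractor"],
    standards=["within 2 degrees of the correct azimuth",
               "120 seconds or less"],
)


def passes_azimuth_item(stated_deg: float, correct_deg: float, seconds: float,
                        tolerance_deg: float = 2.0,
                        time_limit_s: float = 120.0) -> bool:
    """Apply both standards: minimum acceptable level and time requirement."""
    diff = abs(stated_deg - correct_deg) % 360.0
    error = min(diff, 360.0 - diff)       # assumption: 359 and 1 are 2 apart
    return error <= tolerance_deg and seconds <= time_limit_s
```

An objective that returns a non-empty list from `missing_parts()` is one that cannot yet be used to write test items.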

Now consider this objective: "Using an M543 wrecker and an M-6 sling,
the wrecker operator trainee will be able to operate the hoist as directed
in unpackaging the Honest John warhead section following the sequence
specified in TM 9-1340-202-12. Performance will occur on an outdoor, flat,
hard surface."

Dividing this objective into its parts:

• Performance. Operating the hoist. Here the main intent of the
objective can be directly observed and needs no indicator.

• Conditions. There are several conditions stated throughout this
objective; conditions are not clustered in one part of the
objective. First, the equipment to be used is specified. Second, the
material to be operated on (the warhead) is specified. Third, the
environmental conditions are described. And finally, special
instructions are implied: the trainee will be directed in his
operation of the hoist. So, in this objective, all four types of
condition statements (what the student has available to work with,
what he is to work on, environmental circumstances, and limitations/
special instructions) are used.

• Standards. In this objective, the standard is of the standard
operating procedure type. In order to satisfy the objective, the
trainee must follow the sequence specified in the appropriate
technical manual for the Honest John Rocket System. All steps in the
sequence must be completed. No time standard is suggested in the
objective, but you could infer that the task must be performed
within reasonable time limits.

As you have seen, objectives may or may not be "neatly packaged." That
is, you may have to dig a little to find the performance required and to
organize the conditions and standards that apply, and express them in terms
of performances which can be observed. To be suitable for use in developing
test items, an objective must contain explicit statements of performance,
conditions and standards. If it doesn't, it won't be much help to you.

Just having the essential "three parts," however, doesn't automatically
make an objective suitable for test development purposes. Objectives can
have all three parts and still be inadequate.

ASSESSING THE ADEQUACY OF THE OBJECTIVES

There are four major checks that you should make in assessing the
adequacy of objectives. These checks will be facilitated by working from
your list of objectives broken down into their three parts (performances,
conditions and standards). The checks include determining that:

• Each objective covers a single task, and is not a combination
of tasks.

• The main intents of objectives are clear.

• Performance indicators are simple, direct, and part of the
trainees' normal repertoire of behavior.

• Performances, conditions and standards are specified in
precise, operational terms.

Figure 2-3, a foldout at the end of this chapter, shows the sequence
of operations for checking the adequacy of your objectives. We will
discuss each type of check separately. Please fold out Figure 2-3 at this
time.



Checking That Objectives are Unitary

Looking at Figure 2-3, you can see that if any objective given as
input to the test development process is lacking one or more of its main
parts (performance, conditions or standards), you cannot begin to assess
its adequacy. Instead, you must send such incomplete objectives back
through channels and request clarification. If you think you can fill in
the missing parts of such objectives, you may do so, but send them back for
approval. When you have received clarification from the originators of
the objective(s), you can begin to assess their adequacy.

It is important that the objectives you use to develop a test are
unitary: that each covers one task only. It is much more difficult to
write test items for compound objectives, those covering more than one
task. Figure 2-3 shows that if your objectives each cover only one task,
you can proceed to the next step of assessing their adequacy. However,
any compound objectives must first be broken down into unitary objectives
before proceeding.

To check that objectives are unitary, you should examine the parts
that describe the performance. (Remember, this may be labeled as "task,"
"action," etc.) So, looking at the performance called for in your
objectives, ask yourself the following questions:

• Does each objective call for performance on just one task?

• Are all tasks independent? That is, successful performance
on one objective does not require successful performance on a
preceding one.

If your answer to either question is a definite "no," your objectives
are probably not unitary, and need to be broken down into unitary ones. Do
this by carefully subdividing them as appropriate. Be sure to seek
verification, though, by submitting your list of unitary objectives through
channels to their originator.

Remember, when subdividing compound objectives into unitary ones, all
that is broken down is the "task" (performance) part of the compound
objective. Each unitary objective will include the same conditions and
standards as specified in the compound objective.

Let's look at a couple of examples. First, here are the performance
parts of three objectives, each of which appears to be unitary:

1. . . . activities for maintenance of the SP Howitzer, as
specified in the operations and maintenance manual.

2. Perform the appropriate before-firing . . . for the . . .

3. Perform the necessary before-operation service activities
on the Howitzer as specified.

Note that each of these objectives covers a single, separate task:
(1) maintenance task, (2) set-up task, and (3) service task. Each task
is relatively independent of the others. Consequently, there is no need
to break the objectives down any further.



Now consider the following objectives, which read in part:

1. Treat for shock.

2. Treat for nerve gas inhalation.

3. Administer mouth-to-mouth resuscitation.

4. Control arterial bleeding(s).

5. Give first aid for burns; chest wounds; abdominal wounds;
head, face, and neck wounds; and open arm and open leg
fractures.

6. Correctly apply a tourniquet and construct a hasty litter.

Note that objectives five and six call for performance on several different
tasks, while the other objectives concern single tasks. In addition, there
is a lot of overlap (lack of independence) among objectives. For example,
controlling arterial bleeding is a part of what must be done in objective
five, while treating for shock is probably common to all objectives.

If one were to try to make the above six objectives unitary, it might
be done as follows:

1. Treat for nerve gas inhalation.

2. Give first aid for burns.

3. Give first aid for chest wounds.

4. Give first aid for abdominal wounds.

5. Give first aid for head, face and neck wounds.

6. Treat open arm and open leg fractures (apply a tourniquet if
bleeding cannot be controlled by direct pressure, digital
pressure on pressure points, or elevation).

7. Construct a hasty litter.

8. Administer mouth-to-mouth resuscitation.

These objectives are more nearly independent and cover separate, single
tasks. Note that applying a tourniquet is incorporated in objective six.
It is not really a separate task; it is a normal part of treating compound
fractures where blood flow cannot be otherwise controlled. Also note that
objectives five and six may each seem to cover several tasks. They really
do not: first aid for head, face, and neck wounds is one task (procedures
don't differ). The procedures for treating open arm and open leg fractures
are also the same. All tasks covered in the original six objectives are
now covered in a unitary fashion by the eight new objectives. No
performances have been changed, only broken down into unitary performances.
The conditions and standards for each objective will remain the same.
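The rule that only the performance is subdivided, while conditions and standards carry over unchanged, can be sketched as follows. This is a hedged illustration: the class and function names are hypothetical, and the condition and standard strings are placeholders, since the example objectives give only their performance parts.

```python
from dataclasses import dataclass, replace
from typing import List, Tuple


@dataclass(frozen=True)
class Objective:
    performance: str
    conditions: Tuple[str, ...]
    standards: Tuple[str, ...]


def split_compound(compound: Objective,
                   unitary_performances: List[str]) -> List[Objective]:
    """Return one objective per unitary performance.

    Only the performance changes; conditions and standards are copied
    unchanged from the compound objective.
    """
    return [replace(compound, performance=p) for p in unitary_performances]


# Objective five from the example. The conditions and standards are
# hypothetical placeholders; the example does not state them.
compound = Objective(
    performance=("Give first aid for burns; chest wounds; abdominal wounds; "
                 "head, face, and neck wounds; and open arm and open leg "
                 "fractures"),
    conditions=("placeholder: conditions as written in the objective",),
    standards=("placeholder: standards as written in the objective",),
)

unitary = split_compound(compound, [
    "Give first aid for burns",
    "Give first aid for chest wounds",
    "Give first aid for abdominal wounds",
    "Give first aid for head, face and neck wounds",
    "Treat open arm and open leg fractures",
])
```

Deciding where to cut the performance remains a judgment about the tasks themselves; the sketch only enforces that nothing else changes, and the resulting list still has to go back to the originator for verification.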



Checking for Clarity of Main Intents

The next check is to ensure that the main intent of the objective is
clear. To do this, look at your performance statement for the objective.
Then ask yourself:

• "Does the performance statement call for that performance which
demonstrates the objective?"

If you can answer this question affirmatively, the main intent of your
objective is clear. If your answer is "no," perhaps the performance called
for misses the main intent of the objective, or possibly does not provide
directly observable performance. In either case, you should make sure that
the main intent itself is clear and is defined operationally.

Here are some examples of performance statements in which the main
intent is a clearly specified, directly observable performance.

• "Cross a wire obstacle. . . ." The performance called for is crossing
a wire obstacle and that is the main intent. Crossing the wire
can be directly observed.

• "Unlock the security container. . . ." Unlocking is directly
observable, and the objective's main intent is that a person be able
to unlock the container.



Here is an example of a performance statement in which the main intent
is clear but the performance called for is an indicator:

• "Circle the picture of the proper shears to use for cutting
a curved line in sheet metal. . . ."

Circling the picture is the performance called for, but certainly not the
main intent of the objective. The main intent is clear, though: knowing
which type of shears to use for the task. If the objective wanted the
individual to know which type of shears to use and how to use them, it
might have been stated as follows:

• "Given five different types of shears, select the proper
shears and cut a curved line in the piece of sheet metal."
In this case the main intent of the performance is cutting
a curved line with the appropriate shears; there is no
indicator.



The following are examples of performance statements in which the main
intent is unclear and no indicator is provided:

• "Be aware of techniques for setting up a drop zone. . . ."

"Being aware" of something is vague and ambiguous. How could a trainee
show that he is "aware"? What action is called for? Does the objective
want the person to be able to set up a drop zone, or supervise setting up,
or teach how to set up a drop zone? You can't tell from the performance
statement because the main intent is unclear. Also note that there is no
indicator provided which would tell you how to measure "being aware."

• "Demonstrate an understanding of the differences between
treating a simple fracture and a compound fracture."

As in the preceding example, the main intent is unclear; you don't really
know the purpose of the objective. Are you supposed to find out if an
individual can treat both types of fracture, or are you supposed to see
if a person tries to treat a compound fracture like a simple one? You
can't tell. Also there is no indicator to help you figure out how you
are supposed to measure the "demonstration of an understanding." So you
really don't have any idea of what performance is called for, though at
first glance the statement may have appeared to actually state a
performance.



Finally, let's look at some examples of performance statements with
clear indicators but with unclear main intents.

It is important to know what the main intent is, even when there
is a clear indicator; otherwise you can't know whether the indicator
is really acceptable because you don't know what it is supposed to
indicate.



Consider this example:

• "Place a check mark beside the part numbers of the parts needed
to replace the brush assemblies on the 45 KW generator. . . ."

Note that the indicator is perfectly clear but that the main intent is not
readily apparent. The main intent could include any of the following:

• Be able to select the correct parts for replacing generator brushes.

• Be able to correctly read and interpret a list of part numbers.

• Be able to fill out a request for replacement parts.

• Be able to sort parts needed for one repair task from parts needed
for another repair task.

So you really don't know what the indicator is supposed to indicate.

Look at this example:

• "Demonstrate an understanding of good briefing skills by listing
the three main parts of a briefing. . . ."

Here the indicator is clear; it calls for an observable act, listing. And
it might sound like the main intent is clear. But is it really? Does
"listing the three main parts of a briefing" demonstrate an understanding
of good briefing skills? Listing the main parts of a briefing only indicates
an individual's knowledge of such parts, not his ability to conduct a
successful briefing nor even to recognize whether a particular briefing
is organized in three parts. Although the main intent is stated, it is
not clear. In any case, the indicator doesn't even seem to be in the same
ballpark. The point is that you don't really know what the main intent
is, and the indicator doesn't give you any help in interpreting it. Maybe
the indicator is the performance that the person who wrote the objective
wants measured and the main intent was just poorly stated. Or perhaps
the indicator is poor and the main intent should be clarified and supported
by a different indicator.

In summary, the performance statements for any objectives from which
you have to develop a test must contain clear main intents. If the intent
calls for a performance that is not directly observable, an appropriate
indicator must be provided. When you cannot be sure what the main intent
of an objective is, it must be revised, clarified and approved before you
begin the next check.




Ensuring That Performance Indicators Are Simple,
Direct, and Part of the Trainees' Repertoire
of Behavior



Figure 2-3 shows that if the main intent of the objective is clear,
you must next ask whether it is overt or covert. An overt main intent is
one which is observable and measurable. In the preceding section, the
examples of "cross a wire obstacle" and "unlock the security container"
were overt main intents. Overt main intents do not require indicators.
They already tell you what performance is called for and how to measure it.

Covert main intents require indicators since the performances they
require are not directly observable. A covert main intent tells you the
unobservable performance which the objective is about, while its indicator
tells you how to measure whether or not an individual can perform it.

If your objective's main intent is measured through an indicator, you
should make sure that the indicator is appropriate. A good indicator is:



• Simple. That is, it is as uncomplicated as possible. You don't
want the main intent obscured by an unnecessarily complicated
indicator.

• Direct. Indicators are used when the performance called for by
the main intent of the performance statement is either not
directly observable, or not practical in the testing situation.
But the indicator should be as straightforward as possible. It
should allow you to determine whether or not the main intent
has been satisfied without your having to go through chains of
inference.

• Part of the trainees' normal repertoire of behavior. The trainee
should be able to perform the indicator behavior. The indicator
behavior itself is not what you want to train or test. You only
use it as a measure of the main intent. So it is important that
the indicator is simpler than the main intent and that the trainee
can do it. If the indicator were not a part of the trainee's
normal repertoire, you would be measuring two things: performance
on the indicator and performance on the main intent.

Let's analyze some examples of indicators to see if they are as simple
and direct as possible, and part of the normal repertoire. Here's the
first example:

"Show that you can recognize the major bones of the human
skeletal system by drawing a picture of each bone beside the
names of the bones provided on a mimeographed handout."

Here, recognizing bones is the main intent, while drawing pictures of
bones is how you indicate recognition. Drawing pictures of bones is a
direct indicator in this case, since if a person can draw the correct
picture next to the name of a bone, you know he can recognize the bone.
You don't have to make any inferences. But drawing a picture is not
a simple indicator. Worse yet, drawing a bone well enough so an
examiner could identify it is not a part of the trainees' normal repertoire,
unless the trainees happen to be skilled illustrators. A trainee
could fail to satisfy the objective because he can't draw well, not because
he can't recognize the bone.

In fact, the indicator is a poor one for another reason. The main
intent is to recognize bones but the indicator requires the person to
recall what it looks like, then draw it.

A better indicator for this main intent would be ". . . by writing the
name of the bone next to the picture of the bone" or, better yet, ". . . by
choosing the correct name from the list provided and writing it next to the
picture of the bone." (The pictures of the bones are provided on a
mimeographed handout.)



Now consider this example:

"Be able to recognize properly filled-out and improperly completed
orders. Show your ability to do this by writing examples of each."

The indicator is "by writing examples of each." This indicator appears to
be neither simple nor direct. The performance called for is a complex one
(writing orders) and you would have to infer that an individual could
recognize properly and improperly filled-out orders based on his ability to
write examples of each. In addition, the indicator behavior required appears
to be more difficult than the behavior that the main intent is concerned
with: the ability to discriminate between properly filled-out orders and
those which have not been properly completed. Thus, the indicator is less
likely to be a part of the individual's repertoire than the main intent;
this is exactly the opposite of the way things should be.

A better indicator would be ". . . by sorting examples of orders into
two piles: those that are properly filled out and those that aren't." In
this case, all the individual has to do is sort documents, a simple and
direct indicator of ability to recognize proper and improper documents.
This indicator would also be well within the normal behavioral repertoires
of most trainees.



In summary, if the main intent of an objective is covert, not directly
measurable (for whatever reason), you should check to be sure that an
indicator is provided, and that it is a simple, direct behavior which the
trainee is easily able to perform.

If indicators are not adequate (because they are not simple or direct
enough, or not a part of the trainees' normal repertoire of behavior) or
if necessary indicators are missing, you may modify the indicators or
create new ones. Be sure, though, to have them approved by the objective
writer. If you don't feel you can properly modify or create a new indicator,
you should request improved indicators. When the necessary indicators are
revised and approved, proceed with the final check on the adequacy of your
objectives.



Checking That Performances, Conditions, and Standards
are Specified in Precise, Operational Terms

The third check you should make for an objective is to ensure that
the statements of performance, conditions and standards are written in
precise, operational terms. This means that each statement should be
easily translatable into actions. You have essentially already done this
for the statement of performance by checking for clarity of the main
intent and appropriateness of the indicator. A further check on the
performance statement of your objective will be helpful at this point,
though.

Make sure that the statement of performance uses a specific action
verb and you've about won the battle. Figure 2-4 shows examples of verbs
often found in the performance statements of objectives. The left half
of Figure 2-4 shows examples of non-action verbs which generally are not
suitable for performance statements. The right half shows examples of
action verbs which may be suitable. Of course, it is impossible to list
all appropriate action verbs or all inappropriate non-action verbs.

Figure 2-4. Examples of Verbs Often Used to
Specify Performance in Objectives.
(Only those on the right are
really suitable.)


Sometimes what sounds like an action verb may not be suitable in a
particular context, and what appears to be a non-action verb may designate
observable actions. So, use Figure 2-4 simply as examples of non-action
and action verbs. If the verb in a performance statement is more like
those on the left side than those on the right side of the figure, the
performance is probably not stated in terms precise or operational enough
for you to use. But always examine the verb in the context of the
statement of performance and determine if it is as specific an action verb
as possible.
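A first-pass screen of performance statements by their opening verb can be sketched as below. This is only an illustration: the two verb lists are assumptions drawn from examples used elsewhere in this chapter, not the contents of Figure 2-4, and the function name is hypothetical. As the paragraph above cautions, the verb must still be examined in context, so the result is a flag for review and not a verdict.

```python
# Illustrative openers only; these are NOT the verb lists of Figure 2-4.
NON_ACTION_OPENERS = (
    "appreciate",
    "be aware of",
    "understand",
    "demonstrate an understanding of",
)
ACTION_OPENERS = (
    "state", "climb", "disassemble", "add",
    "cross", "unlock", "circle", "list",
)


def classify_opening_verb(statement: str) -> str:
    """Return 'non-action', 'action', or 'unknown' for a performance statement."""
    text = statement.strip().lower()
    for opener in NON_ACTION_OPENERS:
        if text.startswith(opener + " "):
            return "non-action"
    for opener in ACTION_OPENERS:
        if text.startswith(opener + " "):
            return "action"
    return "unknown"


for s in ("State the conditions for which a tourniquet should be applied",
          "Appreciate the conditions for which a tourniquet should be applied",
          "Be aware of techniques for setting up a drop zone"):
    print(classify_opening_verb(s), "|", s)
```

A statement flagged "non-action" or "unknown" is a candidate for being sent back for clarification; a statement flagged "action" still needs its main intent and indicator checked.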



Statements of conditions and standards must also be written in precise,
operational terms. If they are not, you will not have enough information
to build an adequate test. Figure 2-5 shows examples of statements of
conditions and standards, some of which are specified in precise,
operational terms, and some of which are not. The column on the left shows
what the standards or conditions are supposed to say in certain objectives.
The right column shows how such meanings could be incompletely or
incorrectly specified. Note that properly specified statements of
conditions will tell you all you need to know in order to set up the
appropriate conditions for a test. Standards must tell you as precisely as
possible how the individual will be scored; about how is not good enough.
You, the item writer, must actually determine how to comply with the
standard when you write an item. For example, if the objective calls for
80% accuracy you must decide whether this means 4 out of 5, 8 out of 10,
16 out of 20, etc., based upon your assessments of the requirements of the
situation, and of the resources available. Also note that, at first glance,
some of the poorly specified conditions . . .
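The arithmetic behind the 80% example can be sketched in a few lines. This is an illustrative helper with a hypothetical name; it assumes the standard is given as a whole-number percentage and that a trainee must get at least that share of the items right, rounding up when the share is not a whole number of items.

```python
def items_required(total_items: int, percent: int) -> int:
    """Smallest number of correct items that meets the accuracy standard."""
    if total_items <= 0 or not 0 <= percent <= 100:
        raise ValueError("need total_items > 0 and 0 <= percent <= 100")
    # Integer ceiling division avoids floating-point rounding surprises.
    return (total_items * percent + 99) // 100


for n in (5, 10, 20):
    print(n, "items:", items_required(n, 80), "correct needed")
```

The helper only converts a percentage into a pass mark for a given test length. Choosing the test length itself is still the item writer's decision, based on the requirements of the situation and the resources available.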
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is Istended To State: 



•Sivea a 45 ^ generatcr via 
a fcix-ifin siaft bearing. . , 

♦ -un^ drdi^afy field, crn- 
di tiers in dayii^t * 

5 . - ,asing a gmltiireter and 
" sigial t§eneral2H" only 

♦ . . .Kiihout getting glue oa 
fee 3TDV2i)le Si;rfeces 



.Tibs3 ibis is An inpj^erly 
Sjfecifi^ Stataient: 



♦ Siven a iialf^rtioning 
generatcr- * . 

/# 

♦ . . .»jnder cj^naiy 
cordftiaTS*- 

♦ - . .using ^propri^ 
test e^uipnent 

•1 - .talcio^^^rgper . 



1 

7 



SO 



• . . .following fee sequent^ 
specified in the field Artillery 
Rocket Cr^wna^'s SKftST a>DO}c 

*iJsii}g a ID-*" slide rule» ^nuTtiply 
five-digit, twoHJedual plac^ 
aj3d MTite fee answer to 
rest tent?j 




♦ . . .^iog at least 6D wrds 
per m;iiite' corrected ^of errors 

f . . J&i steak should l>e li^ 
to imedd^ pink io ^e sniddle 



.fbllcaririg tie best 
seqaencs 



• t^ing^ sl'ide T^le, Jiulti- 

ply twD f^Ve-digit, t^o- 

deciiial place nunbers 'and 

regard t&g'cx?rrect*an^er 

I — ' — 

• . - .;«^fn^ at a quick • 
rate 

• . .the steak sboald4>a 
of an acc^febl^ a)lor in 
tl^ ^i^dle 



Figijre 2-5-. Exanples of Statements of Cc^ditioas^and Standanis 
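The 80% accuracy example above (4 out of 5, 8 out of 10, 16 out of 20) is a small piece of arithmetic that you can work out for any test length you are considering. The sketch below is only an illustration: the function name `passing_scores` and the candidate test lengths are assumptions, not something the manual prescribes.

```python
def passing_scores(required_percent, item_counts):
    """Return the minimum number of correct responses that meets the
    required percentage for each candidate number of items."""
    result = {}
    for n in item_counts:
        # Integer ceiling of (required_percent * n / 100): a trainee
        # cannot get a fraction of an item right, so always round up.
        result[n] = (required_percent * n + 99) // 100
    return result


# Candidate test lengths are hypothetical; pick them from your own
# assessment of the situation and the resources available.
for n, k in passing_scores(80, [5, 10, 20]).items():
    print(f"{k} out of {n}")
```

Note that a length such as 7 items cannot express 80% exactly; the standard then becomes 6 out of 7, which is stricter than the objective states. That is one reason the choice of test length is left to your judgment.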



You should ask yourself: "Does it really tell me all I need to know to
establish proper conditions or proper standards, or will I have to supply
information on standards and conditions myself?" If your answer is that
you will have to supply information or fill in details, etc., then the
conditions and standards are not specified in precise, operational terms
and you won't be able to use them. If you tried to use them, you'd run the
risk of going through a lot of effort and ending up with a useless test.



Note that appropriate conditions and standards are often related to
the level of your objective, that is, at what level the objective specifies
performance. For example, a Level One objective may be to repair any
malfunctioning generator. In this case "given a malfunctioning generator"
is an appropriate statement of conditions. If, however, the objective is
to repair a generator with a broken shaft bearing, "a malfunctioning
generator" may not conform to these requirements, and therefore would be
an inappropriate condition.



When you review objectives, if you find some that do not have tasks,
conditions, or standards specified in operational, precise terms, you should
not proceed with test development activities. Instead, send the inadequate
objective back to its originator. You should attach a sheet to each
inadequate objective spelling out what is wrong with it, and stating that
you cannot develop a test for it until you receive clarification. (Be sure
you are not nit-picking and that the objective really doesn't give you
enough information.) Then, wait until you receive such clarification before
you begin the next step of test development.



Let us review what you have done so far. Up to this point you have
examined the three parts of your input objectives, made sure that all
objectives are unitary, ensured that their main intents are clear and that
appropriate indicators are used when necessary, and have checked to see
that all parts of the objectives are specified in precise, operational terms.
Whenever a check has revealed that an objective is inadequate you have
either modified it and sent it back for approval, or documented the problem
and sent it back for revision. Objectives may have been considered
inappropriate for one or more of the following reasons:

• One or more of the objective's three parts were missing

• An objective covered more than one separate task

• Main intent was unclear

• Indicator was improper

• Performances, conditions or standards were not specified in
  precise, operational terms.



CHAPTER 3
DEVELOPING A TEST PLAN



Now that you've assessed the adequacy of the objectives on which your
CRT will be based, and modified them as necessary, you are ready to plan
the test itself. Developing a test plan is an important step in CRT
construction. In this step, you consider factors which will enable you to
construct test items based upon the objectives.



Figure 3-1, which folds out at the end of this chapter, shows the
sequence of operations involved in developing a test plan. Please fold
out Figure 3-1. First, you examine practical constraints, such as time
and manpower availability, to determine if they affect how the objectives
are to be tested. Then, if such constraints are problems, you must decide
how to proceed, either by developing a plan for selecting among objectives
or, if that is not workable, by modifying objectives. Next, you plan the
type of items (item format) to use in the test, and their level of fidelity
(realism). Then, if necessary, you develop plans for item sampling and
for sampling among conditions. Finally, decide how many items should be
included on the test, and document the entire test plan. You then can use
this plan to guide you in constructing a pool of items, which is covered
in the next chapter.



EXAMINING PRACTICAL CONSTRAINTS



Now that you have checked your objectives closely to make sure they
are adequate, you must determine whether they are actually administrable.
To do this you need to take into account several different types
of practical constraints by gathering as much information as possible on
test administration and training conditions. Practical constraints include:

• Time availability

• Manpower availability

• Costs

• Equipment and facility availability

• Degree of realism in training and degree of realism required in
  testing

• And others



Note that these types of constraints are all interrelated. For example,
time availability, manpower availability, equipment availability, and
costs are often all different aspects of the same problem.




The first type of practical constraint, time availability, is easily
understood. Often the situation is such that it is impractical to test
the objective as it is stated in the available time. Perhaps the objective
is "March 25 miles through marshy terrain during inclement weather conditions
in 12 hours" or "Watch a radar scope for enemy blips for 14 hours,
maintaining proper vigilance as indicated by detecting the three simulated
enemy blips presented during the interval." Both of these examples would
take much too long to test practicably in most situations. These objectives
may have to be modified to permit testing in less time. In general, time
limits must be placed on test administration, which in turn may limit the
objectives being tested. Sometimes there are several objectives which,
if tested, would take more time than is available. In such cases, it may
be possible to select among these objectives without having to modify them.



Manpower



Manpower availability can also impose practical constraints. For
example, if under normal conditions it takes 4 men to operate a main
battle tank (a commander, driver, gunner, and a crewman/loader) and you
want to test a class of assistant crewman/loaders under normal operational
conditions, then personnel trained in the functions of commander, gunner,
and driver will be required for the test. If these personnel are not
available, then there is insufficient manpower available for conducting
the assistant crewman/loader test under normal operating conditions.

Often manpower constraints are severe when only a few test administrators
will be available, yet many trainees will have to be tested concurrently.
For example, 20 soldiers may be tested simultaneously in basic first aid
procedures by only two administrators who must thus try to monitor the
performance of ten individuals apiece. There are many instances in which
an objective appears to call for more manpower than is available. In such
instances you may wish to select among objectives, so that enough manpower
can be available for the testing, or to modify objectives so that less
manpower is required.



Costs

Cost is an important factor in developing CRTs. The cost of test
administration must be kept within limits dictated by the testing
budget of the facility where the test will be used. For example, it would
be entirely too costly (and unreasonable) to have a demolition-specialist
trainee blow up a bridge to test his ability to achieve maximum damage.
There must be other more practical means of testing this objective. If
the actual objective specifies demolishing a bridge, it may well have to
be modified so that the bridge is not actually demolished, but the trainees
demonstrate the processes leading up to demolition.

If the cost of testing all objectives is prohibitive, and if selecting
among objectives is feasible, then the best alternative may be to test a
subset of them.



Facilities/Equipment

Often the situation is such that equipment and/or facilities are not
available for test administration. This is especially true for
sophisticated equipment and very specialized facilities. For example, how
can a trainee demonstrate competence in escape and evasion in a tropical
jungle, when the testing must take place in the Southwestern United States?
An extreme example of a facility-caused constraint may be firing a missile
down range. At many test sites it is impossible to obtain a range that
is long enough.

An example of a practical constraint concerning equipment availability
might involve a course on troubleshooting a terrain-following radar system.
The performance objective may include "planting a bug" in the system and
having trainees locate the problem and replace or repair the necessary parts.
However, this radar system is sufficiently complex and costly that it is
not made available for training purposes, which prohibits testing
on the actual equipment. In this case, equipment availability is a very
severe practical constraint. Another example is troubleshooting a computer:
the downtime of the computer may be so costly as to negate its use for
training purposes.

If you have many objectives which would tax facilities/equipment
beyond feasible limits, it may be possible to select among them rather
than to modify the objectives.



Degree of Realism

Another important practical constraint that may impact on CRT
development is establishment of an acceptable degree of realism in training
and testing. Consider training in first aid: In almost all cases of
teaching first aid for an open leg wound, a patient with such a wound is
not available even for observation, let alone practice. A suitable
substitute must be made here, thereby decreasing the degree of realism.
Another such case, just as obvious, is training disarmament of live mines.
The mines, of course, are not live; therefore, the training
conditions are not very close to the real situation.

A high degree of realism in testing is also similarly difficult to
provide. In testing basic marching maneuvers associated with the drill
and ceremony component of basic training, a parade field is necessary.
The degree of realism in testing decreases as the dimensions of the testing
field differ from a standard parade-size field. How real are the testing
conditions if a 40-ft field is being used? Another example involves
testing trainees on detecting and challenging intruders. How real is a
testing situation where the test administrator jumps out at a trainee
while the other trainees wait within hearing distance for their turn? The
degree of realism should not differ from training to testing.



There are other practical constraints which you may encounter in the
development of your test; however, this section covered the most
common ones. Less common types of practical constraints include:

• Logistics

• Supervisory effectiveness

• Communications

• Ethical considerations

• Legal considerations

Remember that in most cases constraints are interrelated. As you'll
recall, the practical constraint in the example of the terrain-following
radar system was categorized under equipment availability. This constraint
could also be categorized under costs. Another instance of interrelation
is in the example of the 40-ft. field being used for testing basic marching
maneuvers. Not only was the degree of realism low, as indicated in the
example, but the objective was limited by facility availability.



Potential Sources of Data

Information on practical constraints can be obtained from a variety
of sources. One source is current documentation on test administration
and training conditions (such as Army Field Manual 21-6, TRADOC Reg 350-100-1,
TRADOC Pam 600-11, etc.). These documents are good sources for current
procedures in this area, but more direct sources of information on training/
testing situations at specific locations are preferable. Such direct
sources include personal experience and observations, and the observations
of your associates, especially those who have given similar tests before
at the same place. The best single source of information on practical
constraints at a particular site is a visit to that site. If possible, you
should arrange to go to the site and observe first-hand the availability
of facilities, equipment, and manpower. While there you should talk with
personnel who conduct training and testing to find out more about time
availability and budgeting considerations at the site.

Other sources of data may also be available to you. Use your
discretion and make sure that this information is accurate.



Assessing Practical Constraints

After you have identified practical constraints, you must determine
whether they are severe enough to prohibit testing all objectives as
stated. As you have probably noticed, some constraints may be very strong,
while others are relatively unimportant. Each must be considered carefully.
Some constraints may be so severe that they necessitate modification of the
objectives, or selecting among objectives, whereas other constraints may
be easily overcome.

As you can see from Figure 3-1, if practical constraints do not
constrain testing of all objectives as they are stated, there is no need to
either select among objectives or modify objectives.

However, if practical constraints prevent testing of all objectives
as stated, you will have to select among objectives or modify objectives.
First determine whether it is feasible to select among objectives. It often
is feasible, unless objectives concern critical tasks.

    When your objectives concern critical tasks, you should probably
    not select among them. That is, if misperformance could lead to
    loss of life, property, or mission failure, you should be sure
    that everyone can meet every objective.

Then determine if selecting among objectives will overcome practical
constraints. Sometimes selection won't overcome practical constraints
since it is possible that any one objective, as stated, would overtax
resources. So, before deciding to select among objectives, make sure
that doing so will solve the constraints problem.
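The sequence of decisions just described can be written down as a short routine. This is a sketch of one reading of the text, not a transcription of Figure 3-1; the function name and its three yes/no inputs are assumptions made for illustration.

```python
def plan_for_constraints(constraints_prevent_testing,
                         objectives_concern_critical_tasks,
                         selection_overcomes_constraints):
    """Return the course of action suggested by the three checks."""
    # No constraint prevents testing every objective as stated.
    if not constraints_prevent_testing:
        return "test all objectives as stated"

    # Selecting among objectives is treated here as feasible only when
    # the objectives do not concern critical tasks.
    selection_feasible = not objectives_concern_critical_tasks

    # Selection is worthwhile only if it actually solves the problem;
    # a single objective may overtax resources all by itself.
    if selection_feasible and selection_overcomes_constraints:
        return "select among objectives"

    return "modify objectives"


print(plan_for_constraints(True, False, True))
```

Treating "critical tasks" as an absolute bar to selection is a simplification; the text says only that you should probably not select among such objectives.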



Selecting Among Objectives

If it is feasible to select among objectives, and doing so will
overcome practical constraints, then, instead of modifying objectives (which
runs the risk of distorting their original intent), you include objectives
as originally stated, by selecting among them. Don't inform trainees
which objectives you intend to test, however. If trainees know they may
be tested on any objective, but don't know which, they must prepare for
all of them. Let's look at an example.

Suppose we are developing a CRT to use in evaluating pie-making
ability in a food service course. Assume that there are 10 testable
objectives. Each involves being able to bake a pie which is rated as
adequate by three independent judges. The following 10 pies are taught:

• Apple pie

• Cherry pie

• Peach pie

• Blueberry pie

• Coconut cream pie

• Pecan pie

• Raisin pie

• Black raspberry pie

• Banana cream pie

• Lemon meringue pie

Now, assume that the training lasts 10 hours (1 hour per pie) and
that 1[illegible] students are to be tested. We have two hours available
for our end-of-unit CRT. It is prohibitively expensive to provide sufficient
ingredients for each student to bake each pie. Here is a case where we
might legitimately select among objectives in developing CRTs, rather than
testing on each individual objective. Thus, trainees might be tested on
their ability to prepare only two pies (one fruit type and one cream type).
This is an example of "stratified" selection among objectives. One objective
is selected from each of two strata. If all pies were of the same type,
there would be no strata, and any objective could be randomly selected.



. . . repairman was to be tested on his ability . . . on repairing at
least one radio, one oscilloscope, and one signal generator.

In any case, it is important that the trainees not know which particular
objectives (which pie, which radio, etc.) they will be tested on. They
must be responsible for all objectives.

Two important aspects of selecting among objectives in CRT development
are indicated in Figure 3-2.



    When selecting among objectives in CRT development be sure that:

    • The objective or objectives to be tested are
      chosen at random from the entire population
      of objectives available for testing

    • The students to be tested are not informed
      of the sample of items selected for testing

Figure 3-2. Guideline for Selecting Among Objectives in
            CRT Development



Remember, if you select among objectives, you can only guarantee that
trainees can perform objectives on which they were tested (and passed).
You can also document the testing procedure to inform people that trainees
were responsible for all objectives, did not know which they would be tested
on, and had an equal chance to be tested on any objective (since you
selected at random from among the objectives). As noted, this is not
appropriate for critical objectives, but it will be satisfactory for many
others.

Document your plan for selecting among objectives so that you will
have a record of how to do it when you build your test. Documentation
might simply say: "Select randomly any two of the five objectives," or
(as in the case of the pie-making example), "Select any one fruit pie
randomly, and any one cream pie randomly."
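A documented plan such as "Select any one fruit pie randomly, and any one cream pie randomly" can be carried out mechanically when you build the test. The following is a minimal sketch, assuming the objectives have already been sorted into strata. The grouping shown is hypothetical: the manual does not say which stratum each pie belongs to, so only the pies whose type is evident from the name are listed, and the function name `select_objectives` is likewise an assumption.

```python
import random

# Hypothetical strata for the pie-making example. Pecan, raisin, and
# lemon meringue are left out because their stratum is not stated.
PIE_STRATA = {
    "fruit": ["apple", "cherry", "peach", "blueberry", "black raspberry"],
    "cream": ["coconut cream", "banana cream"],
}


def select_objectives(strata, per_stratum=1, rng=None):
    """Randomly pick `per_stratum` objectives from every stratum."""
    if rng is None:
        rng = random.Random()
    selected = {}
    for stratum, objectives in strata.items():
        # sample() draws without replacement, so every objective in a
        # stratum has an equal chance of being chosen.
        selected[stratum] = rng.sample(objectives, per_stratum)
    return selected


# Draw the selection privately; trainees must not learn the result.
print(select_objectives(PIE_STRATA))
```

If all objectives were of one type, the same routine works with a single stratum, which reduces to plain random selection.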



Modifying Objectives in Light of Practical Constraints

In light of the constraints found, objectives may have to be modified.
Consider the three parts of objectives discussed earlier: performances,
standards and conditions. Performances should not be modified unless
absolutely necessary. Standards, on the other hand, may be modified. For
example, you may have to lengthen or shorten time limits for testing. In
many cases you will find it necessary to modify conditions, such as settings,
locations, etc. Assess each constraint separately and modify the objective
as required. Modify as little as possible to make the objective acceptable
and accurate, but still appropriate for testing.



Now let us look at an example of a situation in which you would have
to modify an objective because of practical constraints in the training/
testing situation. Here is the objective:

"Given a complete field kitchen set-up, the basic cook trainee will
prepare a standard dinner meal for 250 persons under tactical forward area
mess conditions. The meal must be prepared within 3 hours, and the student
must follow hygienic regulations as specified in the POI for Basic Cook.
The trainee will have a food service apprentice under his supervision.
Food will have to be prepared with a minimum of noise and light, and
normal perimeter security regulations must be observed. The meal must be
rated as satisfactory by three judges, all of whom have held the MOS for
Basic Cook for five years and have been first cook for at least three years."


You make a site inspection of the facilities where the testing is to
be conducted and find the following facts which you feel are potential
practical constraints:

1. A test range equivalent to a forward area is not available.

2. An average of 14-16 men are trained at once for the basic cook
   MOS. Total test time available for the field kitchen unit is
   12 hours and must include tests of setting up the field kitchen,
   maintaining equipment and preparing morning and afternoon meals.

3. The training budget will not allow for food for feeding 250
   people per test; food cannot be wasted. All food prepared must
   be eaten according to the SOP at this facility.

4. Three cooks, each with three years experience as first cook, are
   not available for testing purposes. Only one such individual is
   available. There are several other cooks available, but none
   has served as first cook.

5. Only three test administrators are available.

6. Only two field kitchen set-ups are available.

Considering the above information on practical constraints, it should be
obvious that the objective must be modified before a test can be developed
which will be suitable for that facility. The question is, "How can the
objective be modified so as to not violate its intent?" Let's consider
the types of constraints and analyze how they affect the objective.

First, facility and equipment constraints do not appear important:
There are two field kitchens available, which should be ample. Although
there is no test range similar to a tactical forward area, such an area
can be simulated. The simulation can be made more realistic by playing
tape-recorded "field" sounds (artillery, fire bursts, etc.), requiring
minimal cooking sounds, and maintaining minimum lighting. The resulting
loss in fidelity should not be critical in this situation.



Manpower constraints do appear serious, though, on several counts.

. . . portion of the objective; one such cook is available to participate.
The "judges" manpower constraint can be disposed of now: The objective's
specifications for judges are probably too rigorous. They can be relaxed
without seriously affecting the intent of the objective (measuring the
trainees' ability to prepare a satisfactory meal). The objective could be
easily modified to read ". . . rated as satisfactory by three judges currently
holding the MOS for basic cook and all having at least six months experience."
This is a much lower requirement for the judges, but should be appropriate
and adequate for the test situation.

Time constraints are quite severe. Assuming that the other tests
which must be given for the field kitchen unit (setting-up, maintenance,
etc.) will require two-thirds of the 12 hours available, only four hours
are available for testing 14-16 men, and each must be tested on his ability
to prepare a meal for 250 people within three hours. Obviously, the time
constraints are too severe to get around by trying to stretch time
availability for testing or by slightly lessening the time requirements stated
in the objective. But, since time constraints are interrelated with
manpower availability, they can be overcome by manipulating the manpower.

Given the two field kitchen setups available, two groups of trainees
can be tested at once. Although the objective specified the trainee being
tested with a food service apprentice to help him, it should not alter the
spirit of the objective to require the trainee to serve either as
supervisor or as food service apprentice. If we modify the objective in light
of this, we can now test teams of two trainees (one supervisor and one
apprentice), one at each field kitchen setup.

Also, the requirement that a meal be prepared for 250 troops is probably
over-stringent. The trainee could just as easily demonstrate his ability
to prepare meals for large groups by preparing a meal for 100 troops. This
should take only about two hours instead of three. If we modify the
objective accordingly, we can now have two teams of two working concurrently
at each field kitchen. Thus, 16 trainees can be tested in four hours.
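The figure of 16 trainees in four hours follows from simple multiplication, and it can help to lay the arithmetic out when you weigh alternative modifications. The sketch below is illustrative only: the function and parameter names are assumptions, and the "as originally stated" case assumes one tested trainee per kitchen, since the apprentice in the original objective is not himself being tested.

```python
def trainees_tested(kitchens, teams_per_kitchen, trainees_per_team,
                    hours_available, hours_per_meal):
    """Number of trainees who can be tested in the time available."""
    # Only whole testing sessions count; a partly cooked meal cannot
    # be rated by the judges.
    sessions = hours_available // hours_per_meal
    per_session = kitchens * teams_per_kitchen * trainees_per_team
    return sessions * per_session


# Objective as originally stated: 3-hour meal, one tested trainee
# per kitchen, 4 hours available.
print(trainees_tested(2, 1, 1, 4, 3))

# Objective as modified: 2-hour meal, two teams of two per kitchen.
print(trainees_tested(2, 2, 2, 4, 2))
```

The first call shows why the original objective cannot be administered to a class of 14-16 men; the second reproduces the 16 trainees cited above.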

All trainees can take a brief written test on planning evening meals
for 250 troops (quantities of supplies involved, scheduling, logistics,
etc.) and on managing food service assistants. Thus, whether a trainee
served as cook (supervisor) or apprentice, he would be tested on planning
and managing preparation of an evening meal for 250.

Finally, there is a cost constraint: food cannot be wasted. This is
not an important constraint, since it can be easily overcome. A total
of 800 troops could be fed from the meals produced by the eight groups of
trainees. These portions could be served to other troops on field
exercises in the area, if scheduling were coordinated. Alternatively,
the prepared food could be trucked to a mess hall and served as the dinner
meal.

It is helpful to make a table of the conditions and standards in an
objective that requires modification in light of practical constraints.
Figure 3-3 shows such a table filled in with information from the food
service example we have been discussing. Note that the table presents
the conditions and standards which require change, why they require change,
and how they should be changed.

Use of a tabular summary such as Figure 3-3 will help you organize
information on modifying objectives to overcome practical constraints.
By using a summary table, you won't lose sight of the forest by
concentrating on the trees.

Here is how the objective might read after being modified in light of
practical constraints:

    "Given a complete field kitchen set-up, the basic
    cook trainee will help prepare a standard meal for
    100 troops under simulated tactical forward area mess
    conditions. The trainee may serve as cook or food service
    apprentice. A team of one apprentice and one cook will
    prepare the meal within two hours. The food will be
    prepared with a minimum of noise and light and normal perimeter
    security regulations will be observed. Proper hygienic
    practices, as specified in the POI for Basic Cook, will
    be followed. The meal must be rated as satisfactory by
    three judges currently holding the MOS for basic cook and
    all having at least six months experience. In addition,
    the meal must be suitable for consumption, as specified
    by standard food service regulations, since it may be
    served to actual troops."



Submit Modified Objectives

After modification, send the objectives back to their originator for
approval before proceeding. Be sure to include reasons for modification
with the modified objectives. By doing this, you make sure that the
modified objectives are suitable, that is, that modification has not
distorted the original intent of the objectives.





These Conditions and     Why They Require          How to Modify Conditions and
Standards Require        Change                    Standards so as to Overcome
Change                                             Practical Constraints
-----------------------  ------------------------  ------------------------------
250 people must be fed   [illegible]               1. Only cook for a maximum of
                                                      100 people. Planning a meal
                                                      for 100 people is less
                                                      involved (scheduling,
                                                      assistance required, etc.)
                                                      than planning a meal for
                                                      250 people. No modification
                                                      of procedures, because
                                                      procedures don't change
                                                      significantly when going
                                                      from 100 to 250 people.
                                                   2. Take paper-and-pencil
                                                      test: indicate amounts
                                                      for 250.
                                                   3. Indicate how assistants
                                                      would be managed.

3 master cooks each      Manpower availability;    Substitute less experienced
. . .                    cannot get three highly   cooks to do the routine
                         experienced cooks         aspects of the judging.

Supervise one            Manpower availability     Have one trainee serve as
apprentice                                         an apprentice.

Location in forward      Availability of           Simulate forward tactical
area                     equipment/facilities:     area:
                         forward tactical area     1. Play tape-recorded "field"
                         not available                sounds: artillery, etc.
                                                   2. Maintain minimum lighting,
                                                      minimal cooking sounds.

3 hour time limit        Too many trainees to      1. Test two at a time for
                         devote 3 hours to test       about 2 hours each
                         each one                     (feasible, if meal is for
                                                      about 100 people).
                                                   2. Have one trainee serve
                                                      as an apprentice.
Figure 3-3. Tabular Format for Summarizing Conditions and Standards
            that Require Change in an Objective and How to Change
            Them (With Sample Information from Food Service
            Example)

Before constructing your test items, you will be faced with questions:

• Recall items?

• Simulations?

• Supervisor or peer ratings?

Virtually any of these formats can be adapted to any testing situation.
There may even be others that are more appropriate. Which should you
choose? These are questions involving item format and test fidelity.



First, let us discuss what we mean by the term "fidelity." The term
"test fidelity" addresses the extent to which a CRT resembles the actual
objective (or performance) being tested. The more the CRT resembles the
performance in question, the higher the fidelity of the CRT. It is
probably obvious to you that this is one place where practical testing
constraints have a direct impact on CRT development. If, for example, it is
too costly to use an actual aircraft for a maintenance test and you must
therefore use a simulator, you lose fidelity, unless the simulator is
very much like the actual aircraft in terms of required performances. To
the extent that the performances required on the simulator approach those
required on the actual equipment, the fidelity loss is minimized. Some
simulators, however, cause a great loss in fidelity. For example, if the
simulator is a series of 35mm slides of an azimuth cursor and the
performance required of the trainee is to check which of four alternative
slides is most like the required cursor placement, the fidelity loss from an
actual operational radar scope is dramatic. One useful test fidelity scale
is shown in Figure 3-4.

[Figure 3-4. Fidelity Levels and . . . The surviving scale labels are
"Measure Simulated Behavior" and "Measure 'Real Life' Behavior."]

Now that you have an idea of what is meant by the term "fidelity,"
you can see that item format and test fidelity are closely related.
Practical testing constraints may dictate the use of a four-alternative
multiple-choice paper-and-pencil test, for example, because such tests are
simple to administer and easy to score, although the test fidelity may be
low.



A good guideline for item format is:

    Select the format that best approximates the behavior specified by
    the objective.



If the instruction is aimed at problem-solving, then the items should
address problem-solving tasks and not, for example, knowledge about the
required background content. If the instruction is intended to teach how
to evaluate a particular performance, the items should be about evaluating
that performance, not actually doing that performance.



Item format and test fidelity are difficult issues. Follow the
guideline above to the extent possible, consistent with practical
constraints. Use a format which will permit the highest level of fidelity
practicable.

Typically, it is easier to develop high fidelity CRTs for hard skill
subject matter areas (such as electronic maintenance and artillery
[illegible] computer) than for soft skill areas (such as leadership and
tactics). This is because hard skill areas generally include objectives
which are more easily specified in terms of concrete behaviors.

* » 

Types of Items for Written Tests



Some objectives can best be tested by paper-and-pencil items. Such
tests are usually printed on a form with spaces for answers.
Paper-and-pencil items are best suited for evaluating knowledge, ability to use
information, problem-solving and written computations. They are
sometimes used as low fidelity measures of hands-on performance skills.



Written test items' main advantage is that they can often be easily
scored (indeed, in some cases they can be computer-scored) in contrast to
performance test items whose scoring depends on the test administrator's
observations. Therefore, written items are often relatively reliable
measures; that is, they measure approximately the same thing each time
they are administered. Performance test items, while often less reliable,
are usually more demonstrably valid measures; that is, they are more likely
to measure what they are supposed to measure. Written items should be
used in performance testing only when the performance itself involves
writing or when practical constraints (such as time availability) prevent
selecting among objectives.

There are several different types of formats which are often used
for written test items, including:

• Multiple-Choice Items

• Matching Items

• Completion Items

• True-False Items

• Production Items

Multiple-choice items can be adapted to almost all types of written
tests. The standard best answer (but not necessarily the only correct
answer) is included in the test item itself. This type of item is
versatile, can take a variety of different forms and can be used to test
different aspects of knowledge.
Matching items generally employ two columns of elements. The student
is typically asked to match an element from the first list to the most
closely related element of the second list. It is preferable to have
different numbers of elements in the two lists to discourage the student from
using a process of elimination when he gets down to the last match.

Completion items may come in different forms: one being a question
that requires a short-phrase answer and the other having one or more
internal blanks that require single words or short phrases. You should use
care in writing this second type of completion item; too many blanks may
make a sentence incomprehensible.



True-false items have many disadvantages:

• Many times such items are built around sentences which are lifted
verbatim from training materials (perhaps only changing one word),
which encourages memorization.

• Often it is difficult to determine whether items are true or false
when the sentences are out of context.

• High scores can be obtained by mere good guessing, since there are
only two possible answers.



A good rule of thumb is to avoid true-false test items entirely.
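The guessing problem can be put in numbers. The sketch below is only an illustration: it assumes a hypothetical 10-item test with a passing standard of 7 correct (neither figure comes from this manual) and computes the chance that a student who guesses blindly on every item still meets the standard, first with two-choice items and then with four-choice items.

```python
from math import comb

def chance_of_passing_by_guessing(n_items, n_required, n_choices):
    """Probability of getting at least n_required of n_items right by blind guessing."""
    p = 1.0 / n_choices  # chance of guessing any single item correctly
    return sum(
        comb(n_items, k) * p**k * (1 - p) ** (n_items - k)
        for k in range(n_required, n_items + 1)
    )

# Hypothetical test (an assumption, not from the manual): 10 items, 7 needed to pass.
true_false = chance_of_passing_by_guessing(10, 7, 2)
four_choice = chance_of_passing_by_guessing(10, 7, 4)
print(f"true-false:  {true_false:.3f}")
print(f"four-choice: {four_choice:.4f}")
```

Under these assumptions a pure guesser passes the true-false version about 17 percent of the time, against well under 1 percent for the four-alternative version.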


Production items ("essay" items or oral exams) should also be avoided
due to their subjectivity. There are many ways a student can express an
answer to this type of item, which makes scoring extremely difficult. What's
worse, an individual who can express himself well in writing or orally has
an edge over the individual who cannot, regardless of their relative
achievement on the subject matter.

Some general advantages of using written tests include:

• Easy and reliable administration.

• Easy scoring by hand or machine.

• Coverage of a large quantity of material in a relatively short
amount of time.

• Easy maintenance of efficient records.



However, it is often hard to relate written tests to job performance. In
many cases the student may be able to pass a written test on a performance
and not be able to actually perform the required task. (For example, if
an individual could pass a written, multiple-choice test on bomb disposal
procedures, would you be willing to send him out to defuse an actual, live
bomb?) When using an objective written test you should be certain that
the test items are suitable for assessing the achievement of the objective.



Written tests are most often appropriate for testing abstract
concepts and objectives which require knowledge instead of
performance.



Items for Performance Tests: Process and Product Measures



Performance tests require the student to perform an overt action or
series of actions, rather than to verbalize or write (unless the required
performance is speaking or writing). Figure 3-5 shows a comparison
between performance test items and written test items.



WRITTEN TEST ITEMS                    PERFORMANCE TEST ITEMS

Primarily abstract or verbal.         Primarily nonverbal.

Items address knowledge and           Items are skills, performances or
content.                              job related decisions.

Items usually address                 Items may be sequentially presented.
independent aspects.                  Errors early in the sequence may
                                      affect later items.

Figure 3-5. Some Common Differences Between Performance
Test Items and Written Test Items



In a performance test, the student actually performs a task and is
judged against predetermined criteria. A performance test may involve
product measurement, process measurement or both. Before considering
types of performance items, let's discuss the problem of whether the items
should measure processes or products.



In developing your test plan you will have to determine whether the
objectives require measurement of a product (that is, something which is
tangible and which can be readily measured as to its presence or absence)
or a process (for example, the degree to which a student follows
procedures correctly, regardless of the outcome of his actions).

Product measurement is always appropriate if the objective specifies
a product. If a product measure is called for, it should be incorporated
into the training objective and it should be carried over into the test
items. Product measurement is appropriate when:

• The objective specifies a product.

• The product can be measured as to either presence or characteristics
(such as voltage, length, etc.).

• The procedure leading to the product can vary without affecting
the product.

Process measurement is indicated when the objective specifies a
sequence of performances which can be observed, and when the performance
is as important as the product. Process measurement is also appropriate
where the product cannot be distinguished from the process or where the
product cannot be measured for safety or other constraining reasons.
Generally speaking, process measurement appears appropriate when:

• Diagnostic information is desired.

• Additional scores are needed on a particular task.

• There is no product at the end of the process.

• The product always follows from the process, but high costs
or other practical constraints prevent measurement of the product.

Following are descriptions of conditions which may call for both
product and process measurement:

• Although the product is more important than the process that
led to its completion, there are critical points in the process
which, if misperformed, may cause damage to personnel or equipment.

• The process and product are of similar importance but it cannot
be assumed that the product will meet criterion levels just because
the process is followed at criterion levels.

• Diagnostic information is needed. By having process measures as
well as the product measure, information as to why the product
does not meet the criterion can often be obtained. That is, if
the product does not meet the criterion, then something which has
been done wrong in the process may be discovered.



When both process and product measures are taken for a given objective,
scoring must follow the criterion specified in the objective. That is,
if the criterion specifies only a product, not a process, then process
scores cannot be used to assess achievement of the criterion. This, of
course, does not preclude obtaining additional process information where
such information is useful in an auxiliary way (for example, as diagnostic
information) and is feasible to obtain.

One classification has suggested three types of tasks to illustrate
the relative roles of product and process measurement:

1. Tasks where the product is the process.

2. Tasks in which the product always follows from the process.

3. Tasks in which the product may follow from the process.

Relatively few tasks are of the first type. Drill and ceremonies, playing
a musical instrument and public speaking are examples. More tasks (such
as fixed procedure tasks) are the second type. In these tasks, if the
process is correctly executed, the product follows. For example, if you
pack a parachute by following the correct process, the product, a properly
packed parachute, will follow.

A large number of tasks are of the third type, where the process appears
to have been correctly carried out but the product was not attained. There
are at least two reasons why this can happen: Either we were unable to
specify fully the necessary and sufficient steps in task performance, or
we did not accurately measure them. Rifle firing, for example, illustrates
that there is no guarantee of acceptable marksmanship even if all procedures
are followed. In this case, process measurement would not adequately
substitute for product measurement. So, before using a process measure, ask
yourself this question:




"If I use only a process measure to test a man's achievement on
a task, how certain can I be from this process score that he
would also be able to achieve the product or outcome of the task?"

If your answer is "I can't be very certain," you'd better add a product
measure.
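The guidelines above can be condensed into a rough decision aid. The sketch below is a simplification written for illustration, not a procedure taken from this manual; the function name and its four yes/no inputs are assumptions chosen to mirror the conditions listed in the text.

```python
def suggest_measures(objective_specifies_product, product_can_be_measured,
                     diagnostics_needed=False, critical_process_points=False):
    """Return the kinds of measures ("product", "process") the guidelines point to."""
    measures = []
    # A product measure applies when the objective names a product
    # that can actually be checked for presence or characteristics.
    if objective_specifies_product and product_can_be_measured:
        measures.append("product")
    # A process measure applies when there is no usable product, when
    # diagnostic information is wanted, or when some steps are critical
    # to the safety of personnel or equipment.
    if not measures or diagnostics_needed or critical_process_points:
        measures.append("process")
    return measures

# Illustrative calls (the yes/no answers are the editor's reading of the examples).
print(suggest_measures(True, True))                           # e.g. rifle firing
print(suggest_measures(False, False))                         # e.g. drill and ceremonies
print(suggest_measures(True, True, diagnostics_needed=True))  # product plus diagnosis
```

A real test plan would still rest on the judgment called for in the question above; the function only records which conditions were answered "yes."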



Now, let's look at types of items for performance tests. You will
see that these items can be used for process or product measurement.
Types of Items for Performance Tests: Process Rating



When using a rating scale, you should specify the rating a student
needs to achieve the performance specified by the objective. For example,
a scale from 1 to 6 might be used to rate public speaking ability. (See
Figure 3-6.) Here, 6 is the acceptable standard for achieving the
objective, while 1 is the entry level.



    1          2          3          4          5          6

    1: Is poor speaker but speaks without speech impediment.
       (Rating needed for entry into the course)

    6: Speaks fluently in a well-modulated voice, is interesting, does
       not pause inappropriately, etc.
       (Rating needed to pass criterion test at end of course)

Figure 3-6. Sample Numerical Scale for Rating Public
Speaking Ability



Such a scale might also be used to assess entering behavior at the start
of instruction. For example, a student may be required to achieve a 1 in
order to enter the course. If he already can perform at level 6, he may
not need the instruction at all.

The rating scale may also be used to inform a student of his progress.
For example, he may be rated once a week throughout the course, and from
these scores be able to pace himself accordingly. If students
consistently fail to obtain the rating necessary to achieve the criterion
performance, revision of the course curriculum may be indicated.
Consistently low performance ratings require increasing amounts of revision.
When a student achieves the criterion, no further instruction is
necessary. Rating scales, however, require observers to score performance.
So, the scoring is based on judgments, which sometimes makes the ratings
unreliable. The more clearly specified the performance is at each
rating scale point, the more reliable the ratings will be. Figure 3-7
shows a better rating scale for public speaking ability.
    1: Is poor speaker but speaks without speech impediment.

    2: Has nervous mannerisms.

    3: [illegible] a lot.

    4: Presents acceptable speech but delivery is too slow or is not
       sufficiently clear.

    5: Presents acceptable speech but is boring.

    6: Speaks fluently in a well-modulated voice, is interesting, does
       not pause inappropriately, etc.

Figure 3-7. Sample Behaviorally-Anchored Rating Scale



Nevertheless, errors are easily made in rating performances, so let's look
at several different types of rating errors and ways to minimize them.

Since performance tests require the trainee to display actual outputs
(product or process), they depend heavily on actual observations and rating
of outputs. An examiner should rate performances or products under
controlled conditions which should not change from one trainee to another.
Also, the same performance standards should be used with each student.
For example, a scale of 1 to 7 may be used to rate ability to drive a
truck. Figure 3-8 shows such a scale with a rating of 4 specified as the
standard acceptable for achieving the criterion.



    1        2        3        4        5        6        7

    4: Rating needed to pass criterion test

Figure 3-8. Sample Numerical Scale for Rating
Driving a Truck



This standard should be the same for all students. (A 7 means that the
truck was driven in the best possible manner.) A rating of 4 should mean
that the truck was driven to minimum acceptable standards; ideally, all
raters should agree as to what these standards are.
The problem of rating scales lies in the differing judgment of the
observers. These differences (or rating errors) may be classified into
four categories:

1. Error of Standards. Errors are sometimes made because of
differences in observers' standards. If rating is done without any
discrete, specified standards, there may be as many different
standards as observers, thereby causing overrating or underrating.
Standards at each point in the scale must be clearly specified.



Consider the following example:

Ten persons are simultaneously being rated on their
swimming ability. Judgments of the observers will, in
this case, be dependent on their knowledge of swimming
standards and their relative experience in the area. The
more knowledge and experience they have in the area,
the more nearly alike their ratings of the students
will be. More importantly, the more the swimming
standards can be specified in terms of actual behaviors
(for example, "legs do not bend at knees while
kicking = 3"), the better the interrater agreement.



2. Error of Halo. An observer's ratings may be biased because he
allows his general impression of an individual to influence his
judgment. This results in a shift of the rating and is known
as an "error of halo." If the observer is favorably impressed,
the shift is toward the high end of the scale. If the
impression is unfavorable, the shift is toward the low end. This
type of error frequently goes undetected unless it is extreme.

[Portion illegible] ... specific perfor[mance] ... his impression of
the individual as a whole.

3. Logical Error. A logical error may occur when simultaneously
rating two or more traits. When an observer tends to give
similar ratings to traits which aren't necessarily related,
he is making a logical error. It may appear to him that these
two traits are similar when they really aren't. It seems
logical to him but more than likely doesn't to the other
observers. For example, if "efficiency" and "productivity"
are both being rated, some observers may think that they are
highly related. Thus, they would tend to rate both traits
at the same level: If a person is efficient, he must be
productive. This isn't necessarily so, but a logical error
is easily made in such cases.
One way to minimize logical errors is to make the distinction
among different traits to be rated as clear as possible. Point
out to the raters that only separate, independent traits are to
be rated. If possible, give examples of the behaviors associated
with each trait.



4. Error of Central Tendency. An error of central tendency is
demonstrated when different raters tend to rate most students toward
the middle of the distribution. If, for example, the scale has
seven points and you get a large number of 4s from your raters,
they may be exhibiting an error of central tendency.



One way to counter this is to use rating scales with an even
(4, 6 or 8) number of points. Such scales have no midpoint and
you therefore force raters to spread their ratings more than with
a scale having a midpoint. The best solution, however, is to
anchor your rating points with words which describe the behaviors
and/or performances required (as shown in Figure 3-7).
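One informal way to watch for an error of central tendency is to count how many ratings land on the scale's midpoint. The sketch below is an illustration only: the ratings are invented, and the manual does not say how large a share should be treated as a warning sign, so no threshold is built in.

```python
from collections import Counter

def midpoint_share(ratings, n_points):
    """Fraction of ratings that fall on the midpoint of an odd-numbered scale."""
    if n_points % 2 == 0:
        raise ValueError("a scale with an even number of points has no midpoint")
    midpoint = (n_points + 1) // 2
    return Counter(ratings)[midpoint] / len(ratings)

# Hypothetical ratings of 12 students on a seven-point scale.
ratings = [4, 4, 3, 4, 5, 4, 4, 2, 4, 4, 6, 4]
share = midpoint_share(ratings, 7)
print(f"{share:.0%} of the ratings are 4s")
```

Calling the function for a 4-, 6- or 8-point scale raises an error, which reflects the point made above: such scales have no midpoint for ratings to pile up on.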



Let's now look at a few specific types of process rating methods.
There are several types of scales for rating performances that are
observable but transient. You can use:

• A numerical scale

• A descriptive scale

• A behaviorally-anchored numerical scale

• A checklist

If at all possible, use the checklist. The checklist is generally derived
from job performances and is the most reliable rating scale.



1. Checklist. A checklist is useful for rating ability to perform a
set procedure. It's also a simple method of rating skills when
your purpose is to see if students have reached a certain minimum
level. The performance is broken down into elements, which allows
the observer to indicate whether each step has been successfully
achieved rather than merely whether or not final performance has
been achieved. This helps to reduce the error of standards
because it tends to minimize subjectivity. Instead of a large number
of categories from which the observers may choose, there are only
two, "go" and "no-go," on many different items.
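A go/no-go checklist is simple enough to score mechanically. The sketch below is a minimal illustration, assuming that the standard requires every step to be rated "go" (the manual does not state an overall passing rule here); the step names are hypothetical placeholders.

```python
def score_checklist(observations, required_steps):
    """Return an overall "go"/"no-go" result and the list of steps rated no-go.

    observations maps each step name to True (go) or False (no-go);
    a step the observer did not mark is treated as no-go.
    """
    failed = [step for step in required_steps if not observations.get(step, False)]
    overall = "go" if not failed else "no-go"
    return overall, failed

# Hypothetical four-step procedure and one observer's marks.
steps = ["step 1", "step 2", "step 3", "step 4"]
marks = {"step 1": True, "step 2": True, "step 3": False, "step 4": True}
overall, failed = score_checklist(marks, steps)
print(overall, failed)
```

Returning the failed steps along with the overall result keeps the diagnostic value of the checklist: the observer's record shows where the procedure broke down, not just that it did.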
2. Numerical Scale. A numerical scale divides performance into a
fixed number of points (greater than two) depending on the
number of discriminations required and the ability of the raters to
make the discriminations. In most cases, observers can make
at least five discriminations reliably, but not more than nine,
so most numerical rating scales should contain five to nine points.



3. Descriptive Scale. The descriptive scale uses phrases to indicate
levels of ability rather than numbers. Here, the discriminations
can be varied to suit the performance, making such a scale more
versatile than a numerical scale. However, there are also
disadvantages. One major disadvantage is the interpretation of the
phrases. A phrase may not mean the same thing to all observers.
The more behaviorally descriptive the phrase, the better. Another
disadvantage is the difficulty in selecting phrases which describe
degrees of performance which are "equally spaced." For example,
many observers consider "poor" and "fair" to be more closely
related than "fair" and "good."

4. Behaviorally-Anchored Numerical Scale. The behaviorally-anchored
numerical scale includes a numerical scale along with behaviorally
descriptive phrases below each number. Both the number and the
phrase must be considered by the observer. The description can
be a single word or can be relatively detailed. The more detailed
the descriptions, and the more they describe actual behaviors, the
better the rating results are.



Types of Items for Performance Tests: Product Rating

Product rating is more reliable than process rating, since a product
is usually tangible. After completing a performance test, the product
produced is compared with the required product. From this comparison, the
rating is produced. This procedure minimizes many rating errors, since it
provides the observer with a tangible standard with which to compare the
product's suitability.

Product rating methods include the same main types as process rating
methods:

• Checklists (go - no-go items)

• Numerical scales

• Descriptive scales

• Behaviorally-anchored numerical scales



For example, a product checklist for attaching a bayonet to a rifle might
consist of items such as the following:

Circle one

• Is the bayonet firmly attached to the rifle? (go - no-go)
• Is the bayonet positioned properly? (go - no-go)



A behaviorally-anchored numerical scale for a product (correctly-gapped
sparkplug) might look like this:



    1: Sparkplug gap off by ± .004" of specified tolerance

    2: Sparkplug gap off by ± .003" of specified tolerance

    3: Sparkplug gap off by ± .002" of specified tolerance

    4: Sparkplug gap off by ± .001" of specified tolerance

    5: Sparkplug gap set at exact tolerance specified

Figure 3-9. Sample Behaviorally-Anchored Rating Scale



Example of Determining Item Format and Test Fidelity

Now that you are familiar with different types of items, and their
advantages and disadvantages, you should be able to make a considered
judgment of the types required for each of your objectives. When you decide
what type of items your CRT should include and the necessary level of
fidelity, document your decision so you can refer to it when you actually
start constructing your CRT.



Let's look at an example of determining appropriate item format and
test fidelity.

Assume that you are planning a CRT to cover a block of instruction
on presenting oral briefings in a leadership course. The specific objective
is:

Given four hours of library research, be able to prepare and
deliver a 10-minute briefing to a General Officer on the status
of oil shale deposits as a major potential source of energy for
the U.S. Army. The briefing must present clearly and succinctly:

1. How oil shale is formed

2. Where oil shale deposits are found

3. Potential products and uses of oil shale

4. Estimated amount of oil shale in the continental U.S.

Now, what test format do you use? You obviously do not have a spare
General Officer available to practice on. An appropriate CRT format here
might be an oral presentation to the course instructor scored on a go - no-go
basis using a checklist to reflect appropriate aspects of coverage and
presentation (a fairly high level of test fidelity). A test having a much lower
level of fidelity (and certainly not recommended here) would be a paper
and pencil multiple choice test on knowledge about oil shale deposits
and principles of oral presentation.



From Figure 3-1, you can see that once item format and level of
fidelity are planned, the next consideration is whether or not items
should be sampled for objectives. Item sampling within objectives should
be considered when there are large numbers of items that could be created
for an objective. If an objective calls only for a few specific items
(such as carrying out fixed procedures), there is no need to sample.

Sampling within objectives is often necessary in situations where
the objectives to be tested involve abstract concepts. Examples of such
abstract concepts include:

• Mathematical concepts (addition, multiplication, differentiation,
vector analysis, etc.)

• Categorical concepts (identifying species of plantlife, recognizing
symptoms of emotional disorder, selecting suitable positions for
defensive fortification, etc.)

• Problem solving (be able to troubleshoot and identify the malfunction
in any internal combustion engine)



Item sampling within an objective usually occurs in situations where
the objective involves learning a concept (such as addition) as opposed
to a process requiring a fixed order of doing things (such as folding
an American flag or issuing a call for fire).

In cases of teaching concepts it is generally not possible to develop
test items for all possible examples of the concept. Consider the concept
of addition. If the objective specified in the training program concerns
learning to add two three-digit numbers, development of a series of CRT
items which effectively tests all possible two-way combinations of
three-digit numbers is virtually impossible. Hence, CRT items must sample from
the population of items which could be generated to test the concept. We
might, for example, develop five or six items, each of which calls for the
addition of three-digit numbers, and assume that if the criterion had
been met on these items, the student possesses adequate knowledge of the
concept to generalize to any series of two three-digit numbers.
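The addition example can be made concrete. The sketch below is an illustration under stated assumptions: it treats the item population as every ordered pair of three-digit numbers and draws six items at random. The manual does not say how the five or six items should be chosen, so random selection, the function name, and the seed are the editor's assumptions.

```python
import random

def sample_addition_items(n_items, seed=None):
    """Draw n_items distinct addition problems, each a pair of three-digit numbers."""
    rng = random.Random(seed)
    items = []
    while len(items) < n_items:
        pair = (rng.randint(100, 999), rng.randint(100, 999))
        if pair not in items:  # avoid repeating an item
            items.append(pair)
    return items

# There are 900 three-digit numbers (100 through 999), so the population
# of ordered pairs is 900 * 900 items.
population_size = 900 * 900
items = sample_addition_items(6, seed=1)
print(f"sampling {len(items)} items from a population of {population_size:,}")
for a, b in items:
    print(f"{a} + {b} = ____")
```

With a population of 810,000 possible items, testing every combination is plainly out of reach, which is why a small sample has to stand in for the whole concept.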



The more difficult it is to learn a concept, and the greater the
number of possible items in the concept class, the more items will be required
in your sample. In general, the more aspects there are to learn about a
concept, the more difficult it is to learn.

Also, the more aspects a concept has that are similar to another,
different concept, the more difficult it is to learn. For example, if
you are teaching people to recognize types of quartz, there are a number
of aspects of quartz that you will have to cover: hardness, shape, etc.
There are also a number of aspects that quartz shares with other minerals.
Because of these similarities, teaching recognition of quartz will be more
difficult: The student will have to learn to discriminate between quartz
and other minerals having quartz-like aspects.

There are at least two other factors that affect the number of items
necessary for sampling within objectives. First, the relative importance
of a correct classification (whether or not the trainee has mastered the
concept) should help determine the number of items necessary. If it is
critical that a trainee master a concept, more items should be included
for the objective, to ensure that the trainee can accurately apply the
concept. For example, in survival training, an individual must be
able to distinguish between edible plants and poisonous varieties. So, a
relatively large number of items requiring the individual to discriminate
edible from nonedible plants is necessary.
Another factor that may affect the number of items required when
sampling within objectives is limitations imposed by practical constraints.
That is, often time availability, costs, etc. may not allow you to include
as many items as might otherwise be desirable.

Document your item sampling plan so you will have a record when
creating items. This plan should describe the characteristics that the
items to be sampled should have.



Should Performances Be Tested Under Single or
Under Multiple Conditions?



In many situations, CRT performances require testing under multiple
conditions. You may need to perform certain tasks under both daylight
and nighttime conditions. As another example, astronauts
must perform certain maintenance tasks both inside the spacecraft and
during EVA (extra vehicular activity) outside the craft while tethered
by a lifeline. You may have to perform tasks under overload conditions
including high noise levels, humidity levels, temperature levels, and
so forth.



One job which you as a CRT developer will have is the determination
of conditions under which your test will be administered. Your objectives
will specify the condition or conditions required. Often, you may need
to develop test items which could be administered under multiple conditions.
For situations in which performance must be exhibited under a large
number of conditions, you may wish to devise a sampling plan to guide
you in determining which conditions to develop test items for. (This
assumes that it is impractical to test under all possible conditions.)



For each objective upon which a test item is to be constructed, you
should examine the range of conditions stated. Next, you should make a
list of these conditions and rank them in order of priority. Figure 3-10
presents guidelines for testing under multiple conditions.

When developing a scheme for sampling among a large number of testing
conditions, rank the conditions in order of importance, and develop a CRT
item for the performance under each condition ranked in the top 30 percent.
The top 30 percent should include all the more critical conditions; if
it doesn't, you may need to test under more than 30 percent of the
conditions.



• If the performance must be exhibited under each of two
conditions, you should develop a CRT item for each condition.

• If the objective states that the performance may be
exhibited under either of two conditions, toss a coin
and pick a condition.

• If the performance must be exhibited under three
conditions, you should develop a CRT item which tests
the performance under the two most important conditions.

• If the performance must be exhibited under a large
number of conditions, you should develop a CRT to test
the performance under at least 30 percent of the
necessary conditions. Be sure to include the more critical
conditions.

Figure 3-10. Multiple Testing Conditions
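The guidelines in Figure 3-10 can be sketched as a small selection rule. This is an illustration under assumptions rather than part of the manual: the figure does not say where a "large number" begins, so the sketch treats four or more conditions as large, and it rounds 30 percent up to the next whole condition. The function and parameter names are hypothetical.

```python
import random

def conditions_to_test(ranked_conditions, either_is_acceptable=False, rng=random):
    """Pick the conditions to write CRT items for.

    ranked_conditions must already be ordered, most important first.
    """
    n = len(ranked_conditions)
    if n <= 1:
        return list(ranked_conditions)
    if n == 2:
        if either_is_acceptable:
            # "Toss a coin and pick a condition."
            return [rng.choice(ranked_conditions)]
        return list(ranked_conditions)
    if n == 3:
        return list(ranked_conditions[:2])  # the two most important
    # Assumed meaning of "large number": four or more conditions.
    k = (3 * n + 9) // 10  # 30 percent of n, rounded up
    return list(ranked_conditions[:k])

print(conditions_to_test(["day", "night"]))
print(conditions_to_test(["A", "B", "C"]))
print(len(conditions_to_test([f"condition {i}" for i in range(1, 11)])))
```

Because the rule simply takes the top of the ranked list, it only works as intended when the more critical conditions have been ranked first; if critical conditions fall below the cutoff, more than 30 percent should be tested.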



Let's consider an example: Assume an objective specifies testing
marksmanship accuracy with an [weapon illegible]. The trainee is allowed to
fire [number illegible] rounds of ammunition at a stationary target and must
place at least 10 rounds within the bullseye. He must do this under the
following conditions:

• Daytime and nighttime (illuminated range)

• Wind prevailing from left and from right

• Wind velocities of 0, 10 mph, 20 mph, and 30 mph

These conditions combine sixteen ways, such as:

• Daytime, no wind

• Daytime, with [velocity illegible] mph prevailing wind from the left

• Nighttime (illuminated range), with 30 mph prevailing
wind from the right

• Etc.



Since there are a large number of conditions and you can't test under
all of them (for practical reasons), you should develop CRT items to test
marksmanship proficiency under at least 30 percent of them. Rank the
conditions in order of importance, and develop CRT items for at least the top
four items (roughly 30 percent of 16). Here wind velocity, direction, and
day/night conditions are important. So, you may wish to develop items for:

• Daytime, with 30 mph prevailing wind from right to left

• Nighttime, at an illuminated test range, with 30 mph
prevailing wind from left to right

• Daytime, no wind

• Nighttime, at an illuminated test range, with [velocity illegible] mph
prevailing wind from right to left

By testing under the more difficult conditions, you can usually be sure that the trainee can perform under the easier conditions. In this example, though, one easy condition is included: "Daytime, no wind." Inclusion of this condition is an aid to diagnosis. That is, if you had only the more difficult conditions and the trainee failed to perform to standards, you wouldn't know if the failure was due to the difficulty of the conditions or just an inability to perform the target shooting in general. Thus, the easy condition provides a check.

Document your condition sampling plan so you will have a record when you create test items. The sampling plan should indicate the conditions (or combinations of conditions) under which the trainees will be tested.
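The marksmanship example can be written down as a small sampling plan. The sketch below is only an illustration: it enumerates the sixteen combinations, ranks them with a made-up difficulty score (the `difficulty` function and its weights are assumptions, not part of this manual), and keeps the hardest conditions plus one easy condition as a diagnostic check.

```python
from itertools import product

TIMES = ["day", "night"]
WIND_DIRECTIONS = ["from left", "from right"]
WIND_SPEEDS_MPH = [0, 10, 20, 30]


def all_conditions():
    """Every combination of time of day, wind direction, and wind speed."""
    return list(product(TIMES, WIND_DIRECTIONS, WIND_SPEEDS_MPH))


def difficulty(condition):
    """Hypothetical score: stronger wind and darkness count as harder."""
    time_of_day, _direction, speed = condition
    return speed + (5 if time_of_day == "night" else 0)


def sampling_plan(conditions, n_items):
    """Pick the hardest conditions, reserving one slot for the easiest one."""
    ranked = sorted(conditions, key=difficulty, reverse=True)
    hardest = ranked[: n_items - 1]
    easiest = ranked[-1]  # easy condition kept as a diagnostic check
    return hardest + [easiest]


conditions = all_conditions()
plan = sampling_plan(conditions, n_items=4)
for time_of_day, direction, speed in plan:
    print(time_of_day, direction, f"{speed} mph")
```

Writing the plan down this way also gives you the record of sampled conditions that the test plan calls for.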



DETERMINING HOW MANY ITEMS TO INCLUDE IN YOUR TEST, AND DOCUMENTING YOUR TEST PLAN



One task remains: You must decide how many items your test should include. The answer to the question, "How many items should I create?" depends upon the objective: The more complex the objective (the more subtasks it includes), the more items will be required to test it. This is true, but it does not provide enough guidance in decision-making for the item developer. Two other basic factors govern the number of items to be developed:

• The variety of conditions under which the objective must be tested.

• The objective's level of acceptable performance, specified as standards.

The first factor, variety of conditions, has been covered in the preceding section in terms of sampling among multiple conditions. There are, however, objectives which do not specify multiple conditions, yet which may logically be testable under many conditions. For example, if an objective requires a pilot to be able to land a light plane on the main east/west landing strip at Dulles Airport in Virginia, we might be able to test the objective with one item (that is, actually requiring the pilot to land his light plane on that runway). But, if the objective requires the pilot to land on any paved airstrip, we must require the pilot to make as many landings as we feel are necessary, on various airstrips, under various conditions. In doing this, we are considering the range of conditions specified in the objective when we determine the number of items in the test. Develop as many items as are needed to demonstrate that the trainees can perform under the required conditions, sampling the range of objects the trainee must work with, and the range of conditions under which he must work.



The second factor, level of acceptable performance specified as standards, must also be considered in determining the number of items to include. You must include enough items to ensure that the standards are met. For example, suppose an objective states:

Given the appropriate sparkplug wrench, be able to remove a sparkplug from a 1970 six-cylinder staff car in one minute.



To meet the standard as stated in this objective, a trainee needs only to remove one sparkplug in one minute. Suppose the trainee removes the plug in 59.5 seconds but he is rushing frantically. He passes the item, but you aren't sure that it isn't a matter of luck; you're not certain that he could do it every time. In a case such as this, you might want to include two or three items. Each item must match the objective though. Thus, you might plan three items:

• . . .remove the #6 sparkplug in one minute.

• . . .remove the #5 sparkplug in one minute.

• . . .remove the #2 sparkplug in one minute.

You must plan these items before the test, and not vary them during testing. Actually, you are modifying the objective to state: ". . .remove three sparkplugs. . .in one minute or less per plug."
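The modified objective ("one minute or less per plug") implies that every planned item must meet the standard. A minimal sketch of that scoring rule follows; the function name and the sample times are hypothetical.

```python
def passes_objective(times_seconds, limit_seconds=60.0):
    """True only if every planned item was completed within the time limit."""
    return all(t <= limit_seconds for t in times_seconds)


print(passes_objective([59.5, 48.0, 55.2]))  # True
print(passes_objective([59.5, 61.0, 55.2]))  # False
```

Because the items are fixed before the test, the list of times always has the same length for every trainee.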

Consider this objective:

Given your position as observer and the position of the enemy and description of his materiel, issue an appropriate call-for-fire according to the SOP.

If the trainee gave a correct call-for-fire but stumbled in saying it, you might be unsure whether he can meet the objective. Thus, you might write several items, each requiring that a different call-for-fire be issued. Several items would also allow for a wider range of stimulus conditions: your position, enemy position, and description of enemy materiel could all be varied. Again, you are modifying the objective to achieve a more accurate measure of the standard; this must be done before the test is given. It is never proper to add items during a test administration.

So, let's recap the general conditions for determining the number of items to sample the range of performances and conditions. We must create enough items to satisfy ourselves that, if passed, the trainee has met the standards. We must also be certain that each item matches the objective, even if there are many items for a given objective.

Do not get yourself into the conceptual dilemma of stating that "even if the student performs these four items I would not be convinced he has mastered the objective." If you find yourself in this situation, write more items. On the other hand, the test writer must guard against writing large numbers of items which test extremely rare performances under untenable and hard-to-imagine conditions. Simply make sure that all objectives are adequately sampled, and that all conditions and performances are covered without being unreasonable and without writing large numbers of nitpicking items simply to watch the students squirm. It is important, however, that you sample all objectives, cover the necessary performances and conditions, and adequately cover the standards.

The reliability of your test, the extent to which it will measure the same thing each time you give it, is influenced by the length of the test. The more items you have on a test, the more data will be available for making determinations about test reliability. (Reliability is covered in detail in Chapter 7.) A good rule of thumb is: Write too many items rather than too few. You can use those which are left over to develop parallel, or alternate, forms of the same test, or you can conduct an item analysis (as will be discussed in Chapter 5) and eliminate unnecessary and ambiguous items, keeping only the best ones for the final form of your CRT.



The Test Plan Worksheet

In developing a test plan, we have discussed:

• Overcoming practical constraints by selecting among objectives or modifying objectives

• Planning item format and level of fidelity

• Sampling items within objectives

• Sampling among multiple conditions

• Deciding how many items to include on the test

Figure 3-11 shows a worksheet which will help to collect and organize all the documentation of the test plan that you have developed. A worksheet such as this one should be developed for each objective upon which you wish to construct a CRT item. Figure 3-12 shows a sample worksheet filled in for the objective:

"Given a set of climber's spikes and a safety strap, be able to climb a 30 ft. telephone pole in 2 minutes,"

as well as for two other related objectives.

Note that you should fill in the "number of items" column with the number of items required on the final version of your test. As you will see in the next chapter, you will create more items than this so that you can select the best ones by review and other techniques. By creating such a worksheet, you will have all the information needed for developing a test.



Figure 3-11. Test Plan Worksheet




Figure 3-12. Sample Test Plan Worksheet



CHAPTER 4

CONSTRUCTING THE ITEM POOL



Constructing the item pool is the process of creating a group of items from which final test items will be selected. The test plan, developed in the preceding chapter, documents the characteristics of the items necessary for your test. You have specified in your test plan the number of items required for each objective. You should create enough items to satisfy yourself that, if passed, the trainee has performed to the required standards under the appropriate conditions. It is advisable, however, to actually create about twice as many items as specified in the test plan. This will give you the latitude to choose the most appropriate items from a large item pool rather than to settle for the exact number you have created. You can try out and review the item pool and select among the items. In addition, extra items can be used to create alternate test forms.

Where the test plan calls for one item, you should build two; where it calls for two, you should create four. Thus, if the test plan specifies that an objective requires four items, two under each of two conditions, you would construct eight items, four under each of the two conditions.
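The doubling rule is simple arithmetic, sketched below. The function name and its parameters are illustrative only; the factor of two comes from the "about twice as many" guidance.

```python
def pool_plan(items_per_condition, n_conditions=1, factor=2):
    """Return (items required by the test plan, items to write for the pool)."""
    required = items_per_condition * n_conditions
    return required, required * factor


required, to_write = pool_plan(items_per_condition=2, n_conditions=2)
print(required, to_write)  # 4 8
```

Here two items under each of two conditions give four planned items and an eight-item pool, four under each condition.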

Figure 4-1 (foldout at the end of this chapter) shows the sequence of operations necessary for constructing an item pool. Note that development of instructions is included as a part of this process. This applies both to instructions which tell the test administrator how to give the item (and test as a whole) and to instructions which tell the trainee how to take the item (and test as a whole).




CREATE ITEMS BASED ON TEST PLAN SPECIFICATIONS

The process of creating test items is relatively easy and straightforward, but calls for creativity and ingenuity. Take the test plan worksheet (completed in the operations described in the preceding chapter) and follow these steps to ensure construction of the appropriate items:

• Consider the first objective listed. If all objectives are to be tested, start with this objective. If there is a plan for selecting among objectives, start with the first objective specified for selection by the plan.

• Next, consider the format, fidelity level, type of measurements and type of scoring specified for each item to be created for this objective. All items constructed for this objective must meet these specifications.

• Next, look at the worksheet column headed "Sample Items within Objective." This column indicates whether items will have to be sampled from a large group of appropriate items or not. If items must be sampled, this column indicates the characteristics that each item requires.

• Then look at the "Sample among Multiple Conditions" column. This column indicates the conditions under which each item must be tested. The column will specify how many conditions are to be tested and what these conditions are.

• Finally, look at the last column, "Number of items for objective." This column tells how many items to create for each objective. Remember, if one item must be tested under two conditions, you create two items, one for each condition.

• Now, create the kind of item specified in your test plan worksheet for one objective. Then, repeat the entire process for the next objective specified on your test plan worksheet.
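The steps above can be sketched as a walk over one worksheet row. This is a rough illustration, not the worksheet's actual layout: the field names, the `item_slots` helper, and the day/night conditions attached to the pole-climbing objective are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class WorksheetRow:
    """One objective's line on a test plan worksheet (field names are illustrative)."""
    objective: str
    item_format: str
    fidelity_level: str
    scoring: str
    conditions: list = field(default_factory=list)  # sampled conditions, if any
    items_per_condition: int = 1


def item_slots(row):
    """Return one (objective, condition) slot for every item the plan requires."""
    conditions = row.conditions or ["as stated in the objective"]
    return [
        (row.objective, condition)
        for condition in conditions
        for _ in range(row.items_per_condition)
    ]


row = WorksheetRow(
    objective="Climb a 30 ft. telephone pole in 2 minutes",
    item_format="performance",
    fidelity_level="hands-on",
    scoring="pass/fail",
    conditions=["daytime", "nighttime"],  # hypothetical conditions
)
for slot in item_slots(row):
    print(slot)
```

One item tested under two conditions yields two slots, one for each condition, which matches the reminder in the last-column step.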



When creating items, first note the performance called for in the objective (overt main intent or indicator). Then write the test items following the test plan specifications, making sure that the performance in each item written for an objective matches the performance stated in the objective. You should be concerned not only with the performance (although the performance is extremely important), but also with conditions and standards. The rule for this is relatively simple:

Make the test items include the same conditions and standards (no more, no less) as those specified in the objective.

Remember to consult your test plan, though; you may be sampling among the specified conditions.

Consider the following objective:

Given a storeroom of tools used daily at the motor pool, identify the tools needed to replace a fanbelt on any late model jeep, by taking those tools out of the storeroom.

Now suppose the item asks a student to remove tools from a dark storeroom at a specified motor pool. Would this be an adequate item? No! Who said anything about the storeroom being dark? The conditions called for in the test item are different from those called for in the objective. Not only the performances but the conditions and standards also should be the same in the objective and the test item. That's the only way you will find out if the objective has been achieved.

When writing test items, remember to keep the language simple. The student's ability to comprehend difficult language is ordinarily not the skill in question. And remember, all indicators should be within the repertoire of the student. For example, if an item presents information to the student and requires him to calculate manpower needs for a tactical exercise, it should say "Calculate the required manpower" or "How many men are required?" Not, "Evaluate the logistical considerations and advance an estimate of personnel requirements pursuant to the information presented herein."

Now, let's consider an example of developing various types of items for the same objective. Assume that you have the following objective and must develop a test item:

Objective: The student must indicate the best position for locating a light switch to activate a light in the supply closet of a battalion headquarters office.

One possibility is a standard multiple choice item. Figure 4-2 shows such an item:





The best place to locate a light switch for the supply closet is:

A. In the far left inside corner of the closet.

B. On the right inside wall of the closet about one foot from the closet door.

C. On the left inside wall of the closet about one foot from the closet door.

D. Outside the closet, about one foot from the closet door, and on the same wall as the door.

(Answer = D)

Figure 4-2. Sample Multiple Choice Test Item





However, this item requires the student to visualize locations specified in choices A through D. The talent for this kind of visualization may not be in the students' normal repertoires of behavior. This raises an important point:

Use graphs, drawings, and photographs when necessary for clear communication.

Keeping this point in mind, another, better possibility for the light switch objective is an illustrated multiple choice item such as that shown in Figure 4-3.



(Drawing labels: Supply Closet; Battalion Headquarters Office)

Directions: Place a circle around the letter which indicates the best position for the supply closet light switch.

Figure 4-3. Sample Illustrated Multiple-Choice Test



A third, even better, possibility is a simulated performance test item as shown in Figure 4-4.

(Drawing label: Supply Closet)

Directions: Place an "X" at the best position for locating the light switch to activate a light in the supply closet.

Figure 4-4. Sample Simulated Performance Test



Finally, the best choice is an actual performance item where the student enters the room with a red grease pencil and is instructed to "place an 'X' at the best position for locating a light switch to activate a light in the supply closet." Of course, practical constraints may prohibit use of such an item.



Another point to keep in mind when creating items is the following:

Present the test so it does not give the student hints as to the correct answer, but never make it extremely difficult simply to ensure a certain number of failures.

An example of a written item with a hint might be:

"An unfriendly force is shelling your position prior to attack. As soon as the shelling starts, your squad should begin a

1. Orderly retreat to get out of shelling range.

2. Attack to catch the enemy by surprise.

3. Advance toward the enemy position.

4. Move toward cover in previously prepared positions."

In this item, grammatical consistency gives a good hint. Choice 4 is the only one which grammatically follows from the item stem since "begin a move" is proper, while "begin a orderly," "begin a attack," and "begin a advance" are grammatically incorrect.
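A quick check for this kind of grammatical hint can be sketched as follows. It is only a rough aid under a stated assumption: it decides between "a" and "an" from the first letter of each choice, which misjudges words such as "hour" or "unit", and the function names are made up for the example.

```python
VOWELS = "aeiou"


def article_fits(article, option):
    """True if 'a'/'an' agrees with the first letter of the option (rough rule)."""
    starts_with_vowel = option.strip().lower()[0] in VOWELS
    return starts_with_vowel == (article == "an")


def grammatical_hints(stem, options):
    """Return the options that agree with an article ending the stem.

    If only some options are returned, the stem is leaking a hint.
    """
    last_word = stem.rstrip().split()[-1].lower()
    if last_word not in ("a", "an"):
        return list(options)
    return [opt for opt in options if article_fits(last_word, opt)]


stem = "As soon as the shelling starts, your squad should begin a"
options = [
    "Orderly retreat to get out of shelling range.",
    "Attack to catch the enemy by surprise.",
    "Advance toward the enemy position.",
    "Move toward cover in previously prepared positions.",
]
print(grammatical_hints(stem, options))
```

Only the fourth choice survives the check, so the stem should be reworded (for example, by moving the article into each choice).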

Remember, your creativity and ingenuity are called for in creating items. You will have to use your imagination to create the best possible items for each objective.



DEVELOP AND DOCUMENT INSTRUCTIONS FOR ITEM USE

Once you have created the items for all objectives tested in your CRT, it is necessary to develop and document instructions that describe how each item is administered. Generally, tests consist of one type of item (performance items or multiple-choice items, for example), so instructions specific to each item are often not necessary; general instructions covering the test apply to all such items. (We will discuss general instructions in the last section of this chapter.)

Sometimes, though, specific instructions for each item are necessary. They may be necessary for two reasons:

• The item requires special equipment or facility setups, special conditions, or specific standards which the test administrator must implement as a part of administering that item.

• The item requires that special instructions be presented to the trainee in order for him to complete it.

Such specific instructions are part of the items to which they are appended. The items could not be administered or understood without them. Thus, you must create specific instructions. Since they are a part of the item, item adequacy cannot be assessed without them.

When developing specific instructions, keep in mind the following points:

• Specific instructions should be placed with the items to which they apply. Those parts of specific instructions which the trainee should read are written into the item. Those parts which tell the administrator what to do should be included only in a separate "administrator's test copy."

• Specific instructions should tell the trainee whether speed or accuracy is more important. Any time limits should be specified.

• Provide clear instructions to the administrator. Tell him exactly what to say to the trainee, and how to answer questions. (The safest way is to have the examiner read to the trainee directly from the written directions.)

• Provide diagrams of equipment setups and facility arrangements for the administrator, whenever necessary for a given item. Equipment settings (for example, dial settings on meters) should also be specified.

• Specific instructions should tell the trainee exactly what the performance, conditions, and standards are for the item. This is especially important for hands-on performance items. They may also explain the purposes of certain items. An example of a specific instruction is:

"At this station you will be tested on your ability to perform certain tasks on the breech mechanism. These will require you to perform the duties of several Cannoneers. You have five minutes for each performance measure. You will respond appropriately when instructed. Using the breechblock holding tools and the eye bolts supplied, follow each instruction the examiner gives you. You must respond to each instruction correctly in order to pass the performance measure."



The administrator's specific instructions for this item would include what tools and eye bolts to assemble, how to place them at the station, and what instructions to give to the trainees.

Remember, an item is incomplete without necessary specific instructions.

After creating the items and their associated specific instructions, you should assess their adequacy. Let's review some of the requirements for adequate items.



ASSESSING ADEQUACY OF ITEMS

Do Items Match Objectives?

First, you should ensure that items match objectives. Check the following in both the item and the objective to be sure they are the same:

• Performances

• Standards

• Conditions

Then, find the overt main intent or indicator in the objective. This performance should be the same for each item you wrote for that objective. Do they match? If the answer is yes, proceed to the next check. If the answer is no, the item should be revised or rejected.

Third, note the standards in the objective and compare them with the standards in each item written for that objective. If they do not match, the item should be revised or rejected.

Fourth, ensure that the conditions of the objective match those of the item. If they match, the item is successful; if they do not, the item should be revised or rejected.
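The three checks can be expressed as a comparison between an objective and an item. The sketch below is a simplified illustration that assumes performances, conditions, and standards have been written down as short text labels; the `Spec` class and `mismatches` function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Spec:
    """Performance, conditions, and standards of an objective or an item."""
    performance: str
    conditions: frozenset = frozenset()
    standards: frozenset = frozenset()


def mismatches(objective, item):
    """Name each part of the item that differs from the objective."""
    problems = []
    if item.performance != objective.performance:
        problems.append("performance")
    if item.standards != objective.standards:
        problems.append("standards")
    if item.conditions != objective.conditions:
        problems.append("conditions")
    return problems


objective = Spec(
    performance="take fanbelt tools out of the storeroom",
    conditions=frozenset({"storeroom of tools used daily at the motor pool"}),
)
dark_item = Spec(
    performance="take fanbelt tools out of the storeroom",
    conditions=frozenset({
        "storeroom of tools used daily at the motor pool",
        "storeroom is dark",  # extra condition not in the objective
    }),
)
print(mismatches(objective, dark_item))  # ['conditions']
```

An empty list means the item includes the same conditions and standards, no more and no less, as the objective; any other result marks the item for revision or rejection.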



Other Checks on Item Adequacy

You should also ensure that all items are clear and unambiguous. There should be no question as to what is meant. If you are not certain about any of your items or if you think that they can be taken more than one way, see if they can be improved by revision.

You should also take into account whether or not the items are reasonably easy to administer. An item should not be any more difficult to administer than is necessary while adequately matching the objective. Items that are complex to administer will be subject to additional error, both on the part of the test administrator and the trainees. For example, if your item is intended to assess beginning soldering skills, you would not want it to involve soldering microminiature components to a circuit board. Such an item would be difficult to administer because of the necessity of guarding against damaging expensive components, and because of the difficulty of observing the soldered connections. Instead, your item should probably involve soldering major components to a large chassis (or something similar which is more easily administered). The point is, not only should your items be feasible (be within the limits of practical constraints), they should also be relatively easy to administer.

You have stated in your test plan worksheet the level of fidelity as dictated by the test format. You should check now that your items are at the appropriate level. If your objective calls for hands-on performance, then your test plan worksheet should so specify. You must be sure that your items call for the same kind of hands-on performance.

Keep in mind that the higher the level of fidelity, the better the test. But remember, too, that the level of fidelity specified in the test plan must be adhered to, since it was based not only on the objective but also on practical constraints. (Practical constraints may have prevented higher levels of fidelity which would otherwise have been possible.)

When you revise inadequate items, be sure to revise their specific instructions also.

Now you have a pool of items and their associated specific instructions which appear adequate. All that remains is to develop general test instructions for your CRT.

GENERAL TEST INSTRUCTIONS



Proper instructions are an essential part of any test. You should try to make instructions as clear, unambiguous, and brief as possible, both general instructions given prior to testing and specific instructions immediately preceding the items to which they apply. General instructions apply to the entire test, unlike specific instructions which apply only to certain items.

General instructions for any test should include the following types of information:

• The purpose of the test. For example,

"This is a test of your ability to disassemble an M-60 machine gun";

"This is a test of your ability to unscramble code words";

"This is a test of your knowledge of traffic regulations";

etc.

• Time limits for the test. For example,

"You have 60 minutes to complete this test";

"You have 40 minutes to complete Part A of this test, 30 minutes to complete Part B, and 45 minutes to complete Part C";

"You should be able to complete this test in about one hour. Take your time, you will be allowed to finish if it takes you longer";

etc.

• Description of test conditions. For example,

"You will be allowed to use your textbooks";

"You will be tested in a tent filled with CS tear gas";

"You may use any of the tools on the table in front of you";

etc.

• Description of test standards. For example,

"You will be scored on how many items you complete";

"You will be scored on your ability to follow the SOP for doing this task";

"You will be rated as to the smoothness of your landing";

"To receive credit, you must get the exact answer for each problem";

etc.

• Description of test items. For example,

"For each problem, record your answer to the nearest tenth. Show your calculations";

"Troubleshoot each malfunction listed and record the part to be replaced. Do one at a time, continuing until you have diagnosed each malfunction listed";

"Circle the letter indicating the correct choice: A, [illegible]";

etc.

Note: If the test is a written one, it is a good idea to include a sample item with the correct answer. One sample item is worth many words of instructions.

• General test regulations. For example,

"Do not talk to anyone; talking will cause you to fail the test";

"Raise your hand if you need assistance";

"Proceed to the next station when you have finished the task";



CHAPTER 5

SELECTING FINAL TEST ITEMS

The preceding chapters have described how to construct test items [...] upon how effectively each item discriminates between those who have achieved the objective and those who have not. In addition, good items will not confuse trainees, and will pass reviews by peers and experts. The sequence of operations for selecting final test items from the item pool is shown in Figure 5-1 (foldout at the end of this chapter).

In order to select final test items, you will need a pool of about twice as many potential items as are required for the final version of your CRT. You have already checked each item to make sure that it matches its objective, and that the item is clear, unambiguous, reasonably easy to administer, and at the appropriate level of fidelity.

Even after such careful re-examination, it is important to try out the items. It is through tryout that problems which you cannot anticipate may become apparent. In this chapter, we will discuss how to conduct an item tryout and how to use the results. In addition, we will discuss other ways of reviewing test items, to help you select the best ones for the final version of a test. The end product will be a final version of a CRT which is ready to administer.



TRYING OUT THE ITEM POOL

Selecting A Sample

The sample of persons you use to try out the test items must include persons who are similar to those for whom the test is intended. Here we must keep in mind the purpose of the items: to differentiate between those who have the knowledges and skills to reach the objectives on which the items are based, and those who do not. So, about half of your sample should be composed of people who are "masters," that is, people who have already passed the course segment that your item pool is testing, or those who are known to be competent in the subject matter area, such as instructors, or others who are already known to be qualified. The other half should be composed of people who are taking, or are likely to take, the instructional material for which you are developing a test, but who have not yet passed the course, unit, or lesson in question. The second half of your sample, then, should be composed of people who will be taking the CRT, but who are expected to be "non-masters" (since they have not yet had the appropriate training). Thus, about half of your sample may be expected to pass the items in your item pool while the other half should not.

Suppose you had developed an item pool for individuals who have completed the individual tactical training component of BCT. Who would you try out your item pool on? Half of your sample should be people who have already been trained and tested on this component of BCT. The other half should be composed of individuals who are in BCT but who have not yet been trained in the individual tactical training component.

Suppose your test is intended for experienced intelligence specialists. Again, half your sample should be composed of such specialists; but the other half should be composed of people who have been trained as intelligence specialists, and who are not yet experienced. It would be inappropriate to use people who have not received any training as intelligence specialists, since the test is meant to identify those intelligence specialists who have had experience, from those intelligence specialists who have not.



Try out your item pool on the same type of people as those who will take the final version of your test. Half the people in your tryout sample should be "masters," and the other half should be "non-masters."

If your test will be given to several different groups, you should try out the item pool on samples of "masters" and "non-masters" from each group.



Sample Size

The number of individuals to include in your tryout sample must be given
careful consideration. Including too many is rarely a problem; the
difficulty lies in determining the minimum number of people necessary for
the tryout. There are two factors to consider in making this determination:

• The number of items in your pool

• The size of the population for whom the test is intended



The number of items in your item pool is the most critical factor.
You must have more people in your sample than items in your tryout pool.
Otherwise you won't be able to use the tryout results properly.



In general, you should have at least 50 percent more people
in your sample than items in your pool.



For example, if there are twelve items in your tryout pool, you will need
a sample of at least 18 people (nine "masters" and nine "non-masters").
If possible, it is better to have an even larger tryout sample.



The greater the proportion of people in your sample to items
in your pool, the more likely it is that your item analysis
results will be reliable.



The second factor to consider in determining the size of the tryout
sample is the size of the population for whom the test is intended. The
principle here is:



The tryout sample size should be proportionally related to
the size of the population for which the test is intended.



That is, the larger the size of the population for which the test is
intended, the larger the tryout sample should be.

To be representative, a sample should have enough people to reflect
the composition of the test population. There are no set rules for relating
the sample size to the size of the test population, but Figure 5-2 provides
some guidelines.



If your test will be administered        the number of people in your
to about this many people during         tryout sample should be about:
one cycle:

20 or less                               12 to 15
30                                       15 to 20
50                                       25 to 30
100                                      40 to 50
[illegible]                              70 to [illegible]
1,000 or more                            80 to 110


Figure 5-2: Guidelines for Choosing Sample Size



If the population for whom the test is intended is small, the sample
size can also be small and still be effective. For small populations,
the sample size is more likely to be set by the number of items in the item
pool. For example, if the population for a specific CRT in one
administration will be about [illegible] people, you can see from Figure 5-2
that eight people will be enough for the sample (you would actually select
four "masters" and four "non-masters"). But if your test will have about six
items, then your item pool will have about 12 items. Thus your sample
should have at least 18 individuals (number of items in pool plus fifty
percent).

If the test population is large, the sample size will be determined
more by the size of the population than by the number of items in the test.
Remember that the number of items is the most critical factor. So, never
use less than 50 percent more people than items, even if the sample could be
smaller based on the population size.

There is one other important point in selecting a sample that will
be representative of the test population:



The tryout sample must be random.



This means that the individuals chosen from among all available people of
the appropriate type should be selected by chance. If you use a random
sample, you will have the best representation of the test population.



It is very simple to construct a random sample. First, obtain two
lists of the appropriate types of people ("masters" and "non-masters")
available for the tryout. Write the names of the "masters" on separate
slips of paper and place the slips in a helmet. Shuffle the slips
thoroughly and, without looking, pull slips out of the helmet. When you
have pulled out as many slips as needed for the "masters" half of the
sample, keep these and throw the rest away. Then, take slips for the
"non-masters" and repeat the process, ending up with the necessary number
of "non-masters." You will then have a random sample of the appropriate
number of "masters" and "non-masters."
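
The slips-in-a-helmet procedure can also be carried out with a random number
generator. The sketch below is only an illustration: the function name
draw_tryout_sample, the made-up rosters, and the seed value are assumptions,
not part of this manual.

```python
import random


def draw_tryout_sample(masters, non_masters, per_group, seed=None):
    """Draw per_group names by chance from each list (the helmet method)."""
    if per_group > len(masters) or per_group > len(non_masters):
        raise ValueError("not enough available people in one of the groups")
    rng = random.Random(seed)
    # sample() picks without replacement, like pulling slips without looking.
    return rng.sample(masters, per_group), rng.sample(non_masters, per_group)


# Hypothetical rosters of people available for the tryout.
available_masters = [f"Master {i}" for i in range(1, 21)]
available_non_masters = [f"Non-master {i}" for i in range(1, 26)]

chosen_masters, chosen_non_masters = draw_tryout_sample(
    available_masters, available_non_masters, per_group=8, seed=1
)
print(chosen_masters)
print(chosen_non_masters)
```

Fixing the seed only makes the draw repeatable for record keeping; leaving it
out gives a fresh draw each time.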



Let's consider an example of determining a tryout sample. A very
likely sample could be students who are about to start a training cycle.
One group could be pretested (that is, tested before training) and called
"non-masters." The second group could be posttested (tested after
training) and called "masters."



Determination of Test Tryout Samples: Illustrative Problem

The test is to be five items in length. The course cycle has 50
people. Determine the number of people to include in the test tryout
and the number of items. Assume you will use students in a current
training cycle to develop the test for the next cycle.



Solution:



1. A five-item test requires 10 items for the tryout pool.

2. A 10-item pool requires a minimum of 15 people in the
   test sample.

3. Fifty people in the course cycle calls for 15 to 20
   people in the tryout.

4. Randomly select a minimum sample of 16 people for the
   tryout, since the same number of people should be in
   each group.

5. Randomly divide the 16 into two groups of eight each.

6. Administer the 10-item pool to eight "non-masters"
   before training begins.

7. Administer the 10-item pool to eight "masters" after
   the training cycle is completed.
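
The item-count rule used in steps 2 and 4 can be written as a short
calculation. This is a sketch under stated assumptions: the function name
minimum_tryout_sample is hypothetical, and it covers only the "50 percent
more people than items" rule plus rounding up to an even number for equal
groups. It does not apply the population-size guidelines of Figure 5-2.

```python
def minimum_tryout_sample(pool_items):
    """Smallest even sample with at least 50 percent more people than items."""
    # Integer form of ceil(1.5 * pool_items).
    minimum = (3 * pool_items + 1) // 2
    # Round up to an even number so both groups are the same size.
    if minimum % 2:
        minimum += 1
    return minimum


for items in (8, 10, 12):
    total = minimum_tryout_sample(items)
    print(items, "items:", total, "people,", total // 2, "per group")
```

For the 10-item pool in the illustrative problem this gives 16 people, eight
per group, which matches step 4.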



Conducting a Tryout

Now that you have selected a sample, you are ready to conduct a tryout
of the item pool. The tryout should be administered in a standardized
fashion, just as if you were giving the final version of the test. (See
Chapter 6 for a detailed presentation of how to administer and score tests.)
The item pool used in the tryout is likely to take twice as long for a
student to complete as will the final version of the test, since it contains
about twice as many items.



Here are some conditions you should establish during the tryout of
the item pool:

• If possible, have someone else administer the item pool tryout,
  so you can be free to observe the process and note problems.

• Individuals in the sample should be informed that they are serving
  in a tryout to help develop a test. They should be asked to
  make notes of confusing or ambiguous items, and of anything
  they don't understand.

• Essentially the same instructions that will be used with the
  final version of the test should be used. It may not be possible
  to make these instructions exactly the same, since the test
  instructions may be modified based on feedback from the tryout.
  Certain test items may be eliminated by the tryout and subsequent
  review, so instructions associated with them will also be eliminated.

• The tryout is also used to evaluate the instructions. Lack of
  clarity, ambiguity, etc. should be noted by individuals in the
  tryout sample, and the instructions improved. (It is important
  to test for knowledge and skill in the areas covered, rather than
  for understanding of test directions!) Also, remember to give
  everyone in the sample the same instructions; this is important for
  standardization.

• Test conditions should be the same for the tryout as they will be
  in the final version of the test. Do not try to short-cut the
  specified conditions as this will affect your tryout results. For
  example, if items require the use of a 250 foot high jump tower for
  parachutist training, use that tower, not a 40 foot high jump
  platform. If a test item calls for outside administration, give it
  outdoors, not inside.

• Each item should be administered just as it will be in the test
  itself. This means, for example, that if it requires three test
  administrators to administer the final form of the test, you
  should also use three test administrators in the tryout.

• Test standards should be the same in the tryout as in the final
  version of the test. You must be careful to score the items for
  the people in the tryout exactly as you will for the final version
  of the test.

The tryout should be conducted exactly as if it were the final version
of the test. Be sure to administer the tryout in exactly the same way that
the test will be given.



Conducting an Item Analysis on the Tryout Results

There are a number of techniques that can be used to help spot bad
items. All make use of the following principle:



Acceptable items discriminate between "Masters" and "Non-
Masters." Unacceptable items are incapable of making such
a discrimination.



One simple and widely used item analysis technique makes use of a
statistic called a Phi coefficient (φ for short). The data required to
use φ are:

• Which people who fail an item are "Masters" and which who
  fail it are "Non-Masters."

• Which people who pass an item are "Masters" and which who
  pass it are "Non-Masters."

If you have these four bits of data available, you can calculate the value
of φ for each item.



Calculating φ

Let's look at an example of calculating φ. Suppose you have planned
to have four items in your test. You have built an item pool consisting
of eight items. You have drawn a proper sample consisting of 12 individuals
(12 = 50 percent more than the number of items, and the population for
whom the test is intended is fairly small). Figure 5-3 shows the results
of your tryout.



Recall that it was suggested earlier in this chapter that approximately
half of the people in your tryout sample should be "masters" (that
is, people who have already completed the training segment that your CRT is
being developed to test, or experienced people who are acknowledged "masters"
in the area tested). The other half should be people whom you would not
expect to be "masters" (that is, people who are not necessarily
knowledgeable in the subject matter being tested, or have not had the
appropriate training).



[Table: one row per trainee, giving "Master" or "Non-Master" status, a P or F
entry for each of Items 1 through 8, and the number of items passed. The
individual trainee rows are not legible; the summary rows follow.]

                                        Item Number
                               1    2    3    4    5    6    7    8    Total

Number Passed - Masters        5    4    5    6    4    3    4    4     35

Number Passed - Non-Masters    3    2    3    3    2    1    3    2     19

Total Number Passed            8    6    8    9    6    4    7    6     54



P = pass the item; F = fail the item



Figure 5-3. Results of Item Tryout

Now, let's compute the φ coefficient for the items in Figure 5-3.
Look at Item 4. For Item 4, we need:

1. The number of "masters" who gave the correct
   answer to Item 4.

2. The number of "masters" who gave a wrong answer
   to Item 4.

3. The number of "non-masters" who gave the correct
   answer to Item 4.

4. The number of "non-masters" who gave a wrong
   answer to Item 4.

Figure 5-4 is a matrix which helps organize data to simplify computation
of φ. Let's put the data for Item 4 into the matrix in Figure 5-4.









                        Item 4

                   Fail         Pass         Totals

"Masters"          B = 0        A = 6        A+B = 6

"Non-Masters"      D = 3        C = 3        C+D = 6

Totals             B+D = 3      A+C = 9      12


Figure 5-4. Organization of Tryout Results
For Computing φ for Item 4





In the upper right margin you write the total of A+B, the total number
of "masters." The lower right margin (C+D) then is filled in to show the
total number of people in the "non-master" group. The bottom left margin
(B+D) shows how many people failed the item, while the bottom right margin's
total (A+C) shows how many passed the item. The marginal totals in the
right margin, and likewise those in the bottom margin, must add up to the
total number of people in the tryout sample.



It is important to set up this matrix exactly as shown in Figure 5-4.
The φ technique will not work correctly if you don't.

Figure 5-5 shows item/test matrices filled out for each item shown
in the tryout results presented in Figure 5-3. Compare Figure 5-5 to
Figure 5-3 to see how the matrices in Figure 5-5 were filled in.

Figure 5-5: Item/Test Matrices Filled In for the Tryout
Results Shown in Figure 5-3

Now, you are ready to calculate the value of φ for each item.
Figure 5-6 shows the formula for calculating φ.

$$\phi = \frac{AD - BC}{\sqrt{(A+B)(C+D)(A+C)(B+D)}}$$

That is: the numerator of φ equals the value of cell A
multiplied by cell D minus the value of cell B multiplied
by cell C. The denominator of φ is the square root of
the marginal totals multiplied together. φ, of course,
is the numerator divided by the denominator.



Figure 5-6. Formula for φ



Now let's calculate φ for Item #1. Looking at Item #1 in Figure 5-5, you
find the following values:

A = 5          A+B = 6
B = 1          A+C = 8
C = 3          B+D = 4
D = 3          C+D = 6

Substituting these values in the formula shown in Figure 5-6, you get:

$$\phi \text{ for Item \#1} = \frac{5 \times 3 - 1 \times 3}{\sqrt{(6)(6)(8)(4)}} = \frac{12}{\sqrt{1152}} = \frac{12}{34} = .35$$

Similarly,

$$\phi \text{ for Item \#4} = \frac{6 \times 3 - 0 \times 3}{\sqrt{(6)(6)(9)(3)}} = \frac{18}{\sqrt{972}} = \frac{18}{31} = .58$$

Note: Appendix D shows the square roots of all numbers from
1 to 1,000. You can use this table to help in your
calculation of φ.
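
The same calculation can be checked with a few lines of code. This is a
minimal sketch of the formula in Figure 5-6; the function name phi and the
choice to return None when a marginal total is zero are assumptions made for
illustration only.

```python
import math


def phi(a, b, c, d):
    """Phi coefficient for one item.

    a = masters who passed,     b = masters who failed,
    c = non-masters who passed, d = non-masters who failed.
    """
    denominator = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    if denominator == 0:
        # A marginal total of zero (for example, everyone passed) leaves
        # phi undefined; returning None is a choice made for this sketch.
        return None
    return (a * d - b * c) / denominator


print(round(phi(5, 1, 3, 3), 2))  # Item 1 -> 0.35
print(round(phi(6, 0, 3, 3), 2))  # Item 4 -> 0.58
```

The arguments follow the cell letters of Figure 5-4, so the order in which
they are passed matters just as the layout of the matrix does.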



The range of values of φ goes from -1.00 through zero to +1.00.
The value of φ for a specific calculation may be anywhere in that range.
Figure 5-7 shows the range of values of φ.



Values of φ may fall anywhere along this continuum.

                                           +.30
    |------|------|------|------|------|-+----|------|------|
  -1.00  -.75   -.50   -.25     0    +.25   +.50   +.75  +1.00

    -1.00 to +.30: "Warning Flag"     +.30 to +1.00: Acceptable Item Values



Figure 5-7. Range of Values of φ



The values of φ for all eight items in the tryout are shown in
Figure 5-8.



Item      φ

1        .35
2        .33
3        .35
4        .58
5        .33
6        .35
7        .17
8        .33


Figure 5-8. Values of φ for Items in Tryout Sample



If the value of φ is less than +.30 or is negative, the
item may be a poor one. Regard values ranging from +.30
to -1.00 as "Warning Flags" that something may be wrong
with the item.



A value less than +.30 means that the item does not discriminate very well
between masters and non-masters. A negative value (-.55, for
example) means that non-masters do better on the item than masters.



The values of φ for the eight items suggest that Item 4 is the best,
followed by Items 1, 3, and 6 and then Items 2, 5, and 8. Item 7 in the
example may be a poor item. Take a close look at this item before deciding
to use it. (Your tryout sample may have been poor, or there may have been
something wrong with the administration of the tryout, etc.) You should
always regard an item with a φ coefficient ranging from -1.00 to +.30 with
caution; something may be wrong with the item. A value of φ greater than
+.30 indicates that the item is a candidate for inclusion in the test.
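
The +.30 guideline can be applied to a whole item pool at once. The sketch
below is illustrative only: it uses pass counts consistent with Figure 5-3
and the φ values of Figure 5-8, and the variable names and the constant
WARNING_LIMIT are assumptions, not part of this manual.

```python
import math

GROUP_SIZE = 6         # six "masters" and six "non-masters" in the tryout
WARNING_LIMIT = 0.30   # values below this are treated as warning flags

# Number of people in each group who passed each item.
masters_passed = {1: 5, 2: 4, 3: 5, 4: 6, 5: 4, 6: 3, 7: 4, 8: 4}
non_masters_passed = {1: 3, 2: 2, 3: 3, 4: 3, 5: 2, 6: 1, 7: 3, 8: 2}


def phi(a, b, c, d):
    """Phi coefficient; raises ZeroDivisionError if a marginal total is 0."""
    return (a * d - b * c) / math.sqrt((a + b) * (c + d) * (a + c) * (b + d))


results = {}
for item, a in masters_passed.items():
    c = non_masters_passed[item]
    results[item] = phi(a, GROUP_SIZE - a, c, GROUP_SIZE - c)

flagged = [item for item, value in results.items() if value < WARNING_LIMIT]

# List items from highest to lowest phi, marking the warning flags.
for item, value in sorted(results.items(), key=lambda kv: kv[1], reverse=True):
    label = "warning flag" if item in flagged else "candidate"
    print(f"Item {item}: phi = {value:.2f} ({label})")
```

A flag here only marks an item for a closer look; the decision to keep,
revise, or drop it still rests on the reviews described in this chapter.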



Summary of Using φ in Item Analysis



1. φ is best used when items are scored pass-fail, go - no-go,
   acceptable-unacceptable, or 1-0, and when there are about the
   same number of persons in the "Masters" and "Non-Masters" groups.

2. To compute φ for an item, determine:

   A. How many "Masters" passed the item.
   B. How many "Masters" failed the item.
   C. How many "Non-Masters" passed the item.
   D. How many "Non-Masters" failed the item.

3. Fill in the information determined above in a table such as
   this one (and make the additions indicated in the right and
   bottom margins of the table):

                              Item

                         Fail      Pass

      "Masters"          B         A         A+B

      "Non-Masters"      D         C         C+D

                         B+D       A+C

4. Calculate φ by substituting the values from the table into
   this formula:

$$\phi = \frac{AD - BC}{\sqrt{(A+B)(C+D)(A+C)(B+D)}}$$

5. If the value of φ for an item ranges from +.30 to -1.00, consider
   it a "Warning Flag" for that item. Pay careful attention to the
   item because it may be a poor one. It is often better to throw
   out that item, develop a new one, and try it out.



Other Points About Item Analysis

φ may be used for conducting an item analysis of almost any CRT item
pool. It is the technique of choice when the items are scored "pass-fail"
or "go - no-go." However, φ can also be used when individual test items
are given point values. In such cases, it is necessary to set a "pass-fail"
cut-off score for each item.



There are other related statistical measures which are more
appropriate in other situations and scoring arrangements. These will be
found in most standard books on elementary statistics.*

The φ technique described here is the recommended technique for
computing item analyses. You should be aware, however, that if you have a
very small sample, say less than 8 people (4 "Masters" and 4 "Non-Masters"),
φ may not be appropriate. In such a case, you will have to resort to a
simpler (and less accurate) technique.



Item Analysis by Inspection

If you have less than 8 observations, φ is inappropriate. In such a
case, simply examine the numbers of "Masters" and "Non-Masters" who
answered each item correctly. A rough interpretation about item selection
can be made on the basis of judgments about these numbers relative to each
other.



* For example: Guilford, J. P. Fundamental Statistics in Psychology and
  Education. New York: McGraw-Hill, 1965.



Look at the data in Figure 5-3, for example. (Although we have more
than 8 cases here, we can use this data to describe the procedure which is
appropriate for small samples.) The best item seems to be number 4, with
6 "Masters" and 3 "Non-Masters" giving the correct answer. Items 1 and 3
look like the next best. Five out of 6 "Masters" passed these items,
while 3 out of 6 "Non-Masters" gave the right answer. The fourth best item
would be Item 2, 5, or 8. These are marginal with only 4 out of the 6
"Masters" giving correct answers. Among these, the best choice would be
that one which best rounds out the coverage of the selected items. Items 6
and 7 are the poorest of the lot. Only half of the "Masters" gave right
answers to Item 6. It will need to be discarded or revised so more
"Masters" will answer it correctly. There may be an unusual word or phrase
in it which acts as a stumbling block. It may be necessary to create a new
item to cover that objective. Item 7 shows too little discrimination
between "Masters" and "Non-Masters."



You can see that these results correspond quite closely with the
results of the φ calculations discussed earlier. Remember, the φ technique
is preferred.



You should only use the inspection method if you have
less than 8 persons in your tryout.



Cautions on Use of Item Analysis Techniques



There are a number of cautions that you should bear in mind when
using item analysis techniques on CRT item pool tryout results. These
include the following:

1. An item analysis will only serve to warn you which items may
   be inappropriate for the final version of a test. It will not
   tell you which items are necessarily good. A low or negative
   φ does not mean that an item is definitely bad; it just means
   that you should consider it carefully before including it in
   your test.

2. Use the most appropriate item analysis technique that your data
   will permit. φ is the technique of choice unless your sample
   size is very small.

3. Some items may be "chained together" on certain tests. That
   is, they may all be a part of one performance measure. For
   example, a CRT on the disassembly of a specific weapon may have
   10 steps, each of which is treated as an item and is scored
   go - no-go. Each of these steps must be completed in turn for
   the weapon to be adequately disassembled. But if all steps
   are relatively difficult to perform (that is, some people fail
   them, and some people pass them) except for steps 3 and 4,
   which are very easy, and which everyone passes, an item
   analysis would indicate that Items 3 and 4 have a very low φ
   value, probably around zero. That is, Items 3 and 4, in this
   case, do not discriminate well between "Masters" and "Non-
   Masters." Thus, you have a "Warning Flag" for each of these
   two items. But you cannot throw out these items, since they
   are necessary steps in the disassembly of the weapon.

   Whenever you have items that are "chained together" such as
   Items 3 and 4 in this example, you will not be able to throw
   some of the items out and keep others. You will either have
   to throw them all out or keep them all.




REVIEWING REMAINING TEST ITEMS

So far we have discussed only one way of selecting test items:
the use of item analysis techniques. Since item analysis will only
provide "Warning Flags" concerning items which may be poor, you may require
additional ways of judging items. Remember, since you have created an
item pool of about twice as many items as your final test requires, your
goal is to choose the best items for the final version of your test. It
is not necessary to eliminate exactly half of the items in your pool,
since you can always use extra items to make alternate forms of the test.

There are several ways in which you can review items in the item
pool as supplements to the item analysis. They are all essentially
subjective types of review and include:

• Feedback from individuals in the tryout sample

• Peer review

• Formal review by test evaluation units

• Formal review by subject matter experts


Feedback From Individuals in the Tryout Sample



Feedback from the individuals in your tryout sample can be extremely
useful in helping you identify problem items. As discussed in the section
on administering the tryout, students should write down misunderstandings,
points of confusion, and ambiguities noticed during the tryout. You may
wish to use a worksheet, such as the one shown in Figure 5-9, to use in
recording difficulties with the tryout:

         Did you under-   Did you have    Did you under-   Were the equip-
         stand the        enough time     stand how you    ment and facili-
Item #   instructions     to do this      would be         ties for this
         for this item?   item?           scored on        item suitable?
                                          this item?

1
2
3
4

Did you have any difficulties with the general test instructions? If so,
what were they?

(Use as much space as necessary)

Describe any difficulties you had with items.

(Use as much space as necessary)

For each "no" in the table above, describe what the problem was.

(Use as much space as necessary)

Any other comments will be appreciated.

(Use as much space as necessary)



Figure 5-9: Worksheet for Recording Feedback From Tryout



If you use such a worksheet, point out to the individuals who complete it
that their honest feedback will help you to improve the test. Note that
the column headed "Did you have enough time to do this item?" is not
relevant if you have items which involve time requirements or production
rate standards. This column is intended to see if the individuals have
enough time to complete items for which speed is not a part of the standard.



If many individuals (more than 20% of your sample)
have difficulties with the same item(s), the item(s)
in question may be poor.



If you have been able to get another person to actually administer
the tryout for you so that you are free to observe, you should note the
following points during the administration of the tryout:

• Did the trainees appear to follow the instructions easily? (If
  trainees appeared confused, you may want to ask them to repeat
  the instructions in their own words. If they cannot do this
  adequately, make a note of the confusing instruction and revise
  it later.)

• Note questions asked by trainees. You may need to revise your
  instructions to take care of questions which come up frequently.

• Note problems with facilities or equipment. Such problems may
  include malfunctioning equipment, equipment breakdowns, poor
  layout of facilities, hazards resulting from equipment or
  facilities, administrative difficulties in running trainees through
  the test on time, etc.

• If different performance measures are taken at different "test
  stations," note if there are any back-ups or bottlenecks going
  from station to station.

• Note whether the test administrator is able to adequately observe
  the performance of each individual. Also check to see if the
  administrator is inadvertently helping the trainees to do better
  than they could do by themselves.

• If you observe trainees making mistakes, talk with them to find
  out whether the mistake was due to a misunderstanding of the item
  or to an inability to perform.

You can use this record of observations to help discover poor items. In
addition, some observations may aid in improving instructions, facilities,
equipment, and other conditions of administration.



It is a good idea to have several administrators score each trainee
independently. This is especially important if subjective rating scales
are used. Note items which administrators consistently score differently;
these may be poor items.
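
One way to spot items that administrators score differently is to count, for
each item, the trainees on whom the independent scores disagree. The sketch
below is hypothetical throughout: the administrator and trainee labels, the
scores, and the function name disagreement_counts are made up for
illustration.

```python
# scores[administrator][trainee] = one mark per item, 1 = go, 0 = no-go
scores = {
    "Administrator A": {"T1": [1, 1, 0, 1], "T2": [1, 0, 0, 1], "T3": [0, 1, 1, 1]},
    "Administrator B": {"T1": [1, 0, 0, 1], "T2": [1, 1, 0, 1], "T3": [0, 0, 1, 1]},
}


def disagreement_counts(scores):
    """Count, per item, the trainees on whom the administrators disagree."""
    administrators = list(scores.values())
    trainees = list(administrators[0].keys())
    n_items = len(administrators[0][trainees[0]])
    counts = [0] * n_items
    for trainee in trainees:
        for i in range(n_items):
            marks = {admin[trainee][i] for admin in administrators}
            if len(marks) > 1:  # more than one distinct mark = disagreement
                counts[i] += 1
    return counts


print(disagreement_counts(scores))  # [0, 3, 0, 0]
```

In this made-up data the second item is scored differently for every
trainee, so it is the one to examine first.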



Peer Review



Another useful technique for evaluating items is to have peers review
them. These should be fellow instructors, fellow test developers, etc.
Ask your peers to review your item pool and to make notes of any items
which they think should be revised or eliminated.



Formal Review by Test Evaluation Units

Another important type of item review is provided by test evaluation
units. These units range from post educational advisors and their staffs
to entire groups whose sole purpose is the evaluation of test materials.
The test evaluation unit will be especially good at identifying problems
with items that violate established testing principles. For example, they
may easily identify items that are "give aways" or are too easy.



You should also give the test evaluation unit a list of the
objectives, along with your item pool. They can then check to make sure that
your items match your objectives.



Formal Review by Subject Matter Experts

Obtain a review of your item pool by subject matter experts. Since
test evaluation units are often not experts on any particular subject
matter (other than testing), you should obtain a separate review by subject
matter experts for those tests on which you are not expert in the subject
matter.



A subject matter expert can make sure that the content of your items
is accurate. Request that the subject matter expert note any items which
are confusing or misleading. Remember to give the subject matter experts
your objectives, also.



REDUCING THE ITEM POOL

Now that you have completed an item analysis and submitted your item
pool to a review, you are ready to reduce the item pool into a final test.
Your goal here is to end up with a final test which incorporates the best
items.


Figure 5-10 shows a simple way to summarize findings about items. In
the "item analysis" column, check any items getting a φ from +.30 to -1.00.
In the "tryout feedback" column, check the items with which a significant
proportion of the people in your sample (more than 20%) had difficulty.
Similarly, check the items which peers, test units, and subject matter
experts agree are poor.

Figure 5-10. Item Pool Review Summary Sheet
(Check Items Identified as Poor)



Figure 5-11 shows a sample Item Pool Review Summary Sheet filled out
for an item pool containing 10 items. Notice that Items 1, 3, and 4
appear to be okay; neither the item analysis, nor feedback from the
tryout, nor any other form of item review found fault with these items.
Item 6 had a low φ value, but since no other form of review found fault
with it, it is probably okay. Similarly, Item 7 may be okay, but you should
check its structure; the test evaluation unit may have suggestions for
improvement. Item 9 was found poor by all techniques except tryout
feedback; it should probably be eliminated.

Item 2 may have faulty structure since item analysis and the test
unit review found fault with it, and since it confused the people in the
tryout sample. Apparently its coverage of the subject matter was
appropriate. Item 5, on the other hand, may have faulty content but
acceptable structure.



Item 8 was found faulty only by the subject matter experts. Thus,
it may have a technical error. Item 10, though, had a poor rating in the
item analysis, caused confusion to the tryout sample, and was found faulty
by the subject matter experts. This item should probably be eliminated.



In summary, Items 1, 3, 4, and 6 could be used in the final version
of your test with no changes. Items 7 and 8 might be made acceptable with
slight modifications, while Items 2 and 5 would probably require greater
efforts to make them acceptable. Items 9 and 10 should probably be
eliminated.

Figure 5-11: Item Pool Review Summary Sheet with Sample
Entries for a 10-Item Pool
(Check Items Identified as Poor)



The Item Pool Review Summary Sheet is just an aid to help you organize
and consider the information you have collected about the adequacy of your
item pool. Your own judgment must still play a major role, since you are
more familiar with the items than anyone. So, using the Summary Sheet as
an aid to your own judgment, you can decide which items are okay, which
need improvement (and what kind of improvement), and which should be
eliminated.



yhat To Do If You Elirsfnate Too Few Or Too rAny Iters 

Often you may find that you have not been able to cut your item pool
in half, or, on the other hand, that you have had to eliminate too many
items. You don't really have a problem if you haven't been able to
eliminate half the items in your item pool. In fact, you should be pleased:
you have demonstrated your ability to create good items. What's more, you
now have a choice. Either eliminate items by personal preference, or use
the extra items to create alternate forms of your test. If you eliminate
items by personal preference, be sure that you follow your test plan. For
example, you may have planned a 12-item test with 4 objectives and 3 items
per objective, and after reducing your item pool, find that you have 18
items left with which to make the final version of your test. Be sure that
you have 3 items per objective after you discard the 6 extra items. Don't
end up with 6 items for 1 objective and 2 each for the other 3 objectives.
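The test-plan check in the example above can be sketched in a few lines of Python. This is only an illustration: the function name `follows_plan` and the two sample item lists are assumptions made for this sketch, not part of the manual.

```python
from collections import Counter

def follows_plan(item_objectives, objectives=(1, 2, 3, 4), per_objective=3):
    """True if the kept items give exactly `per_objective` items to every objective."""
    counts = Counter(item_objectives)
    return all(counts[obj] == per_objective for obj in objectives)

# Each entry is the objective number that one kept item measures (hypothetical data).
balanced = [1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4]
lopsided = [1, 1, 1, 1, 1, 1, 2, 2, 3, 3, 4, 4]

print(follows_plan(balanced), follows_plan(lopsided))
```

Both lists contain 12 items, but only the first one keeps 3 items for each of the 4 objectives.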



If you use the extra items to create alternate forms of your test,
remember that alternate forms can share items in common. Suppose, for
example, that you have eliminated only 2 items from an 8-item pool, and
that the final version of your test requires only 4 items. Figure 5-12
shows the possible alternate forms of the test you can make with the 6
items, assuming that the items are independent and all are related to the
same objective. Note that each of these fifteen forms has at least 1 item
different from any other form. Each form, though, has at least half the
items in common with any other form. Each form should be equally suitable
as a final version of your test. (Note: there is no need for such overlap;
it just works out that way in this example. If you had enough items left,
you could create alternate test forms with no overlap. Such nonoverlapping
versions are called "parallel test forms.")
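The count of fifteen forms, and the overlap between them, can be checked with a short Python sketch. The item labels A through F are placeholders chosen for this illustration; the manual does not name the items.

```python
from itertools import combinations

# Six surviving items (labels are illustrative, not from the manual).
items = ["A", "B", "C", "D", "E", "F"]
form_length = 4

# Every distinct way of choosing 4 of the 6 items is one alternate form.
forms = [set(form) for form in combinations(items, form_length)]
print(len(forms))

# Smallest and largest number of shared items over all pairs of forms.
overlaps = [len(a & b) for a, b in combinations(forms, 2)]
print(min(overlaps), max(overlaps))
```

Any two forms share either 2 or 3 items, which is why every form has at least half its items in common with every other form and at least 1 item that differs.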



If you eliminate too many items from your item pool, and don't have
enough left for the final version of your test, you will have to create
new items.

Figure 5-12. Alternate Test Forms Possible for a
Four-Item Test Made From Six Items

If you must create new items, you should repeat the entire tryout, item
analysis, and review process. Instead of using a new sample, try only the
new items on your original sample. You then compute new item analysis
values for the new items. Then get feedback from the sample on the new
items, and submit the new items for review by your peers, test evaluation
unit, etc.

Figure 5-1. Sequence of Operations for
Selecting Final Test Items

CHAPTER 6



ADMINISTERING AND SCORING CRTs

This chapter will familiarize you with procedures for administering
and scoring CRTs. Efficient and objective methods of testing, accurate
scoring, and fairness in interpretation of scores are essential in CR
testing. This chapter will help you achieve these goals.



CONTROLLING THE TEST SITUATION



Although the use of a CRT implies that you are not interested in
comparing the performance of one person with another, it is still
necessary that interaction among trainees in the testing situation be
prevented (unless, of course, the objective calls for the cooperation of
two or more people). This simply means that, in paper-and-pencil testing
for example, persons should be seated a reasonable distance from one
another and within easy view of the supervisor; and that in group tests of
performance, sufficient isolation should exist to ensure that trainees
cannot help, hinder, or observe one another.

Whether testing is conducted individually or in groups, it is
essential that test administration conditions be as nearly identical
as possible on all testing occasions. This is necessary for proper
assessment. For example, students should not differ greatly in their
degree of fatigue, hunger, or on any other factor which could affect
performance. The tester should also standardize his own behavior, his
manner, and tone of speech when administering CRTs. Figure 6-1 (fold-out
at the end of this chapter) shows the sequence of operations for
administering and scoring CRTs.



Controlling Environmental Variables 



When administering CRTs, environmental conditions such as lighting,
temperature, and background noise level, which might affect performance,
should be standardized for all persons tested. For example, if the test
involves visual acuity, the surrounding lighting must be very nearly the
same from test to test. Conditions such as heat and humidity can
seriously affect human performance, so that, especially for objectives
requiring prolonged effort and concentration, groups tested at 72° F
might be expected to outperform equivalent groups tested at a humid
95° F.



Normally, the conditions required for testing should be stated in
the directions. It is the responsibility of the tester to ensure that
these conditions exist at the time of testing.

Controlling Personal Variables

Students should be tested under conditions comparable to those
experienced by others who are tested. These include personal, physical,
and emotional conditions. It would not be fair, for instance, to test
one group of students' manual dexterity in the morning immediately
following breakfast, and to give the same test to another group in the
evening after a day of strenuous physical activity. Subjects complaining
of minor illness may be excused and tested at a later time at the
discretion of the test administrator.



Instructions and Tester Variables 



Instructions must be uniform for all persons tested in order to
minimize the possibility of cues and helpful hints becoming available
to some persons and not to others. The standard test instructions
should either be read, or recited from memory. Some typical and
representative instructions for existing tests are shown in Figure 6-2.

The responsibility for standardization of test administration
conditions rests with the test administrator. This includes
standardization of your own behavior, that is, the test administration
procedures which you follow. For example, you are responsible for the
proper timing and termination of the test.



In Chapter 2 the test designer was asked to keep in mind three
main parts of a good objective: performances, conditions, and standards.
You, as test administrator, should also keep these concepts in mind.
It is your responsibility to follow the specific guidelines for a given
CRT.



1. Stated Test Objective: Placing the M60 machinegun into operation
   and performing immediate action.

   Instructions (oral): "At this situation you must load the M60 and
   engage a target at ___ meters. You have three minutes."

2. Stated Test Objective: Passage of obstacles at night and reaction
   to flares.

   Instructions (oral): "At this situation your unit is moving in the
   area of an enemy defensive position under simulated night
   conditions. You must cross a wire obstacle, a trench, and a danger
   area in order to reach your objective. Use nighttime techniques.
   Be prepared to react to an aerial flare."

3. Stated Test Objective: Demonstrate an ability to comprehend written
   Russian by reading Russian prose passages and answering questions
   concerning them.

   Instructions (written): "In your test booklet you will find three
   passages from Russian novels. Read each passage carefully, then
   answer the multiple choice questions following them. You may go
   back and reread parts of a passage if necessary. You have 30
   minutes to complete this test."

Figure 6-2. Typical Test Instructions



Many objectives, as written, are primarily product oriented. You
should, however, feel free to gather additional process information if
such information appears to be useful in an auxiliary way, and can be
obtained without interfering with the performance of those taking the
test. For example, a trainee may be required to repair a radio/telephone.
The "product" sought is an operational radio/telephone unit.
"Process" information which might be noted includes style of work,
care of tools, and adherence to approved procedures.



Figure 6-3 shows some typical steps which help ensure standardization
of your own behavior in test administration procedures.

Familiarization
   • Read instructions and test
   • If possible, observe test as given by another tester

Set Up 1
   • Check environmental conditions against those specified and
     adjust when necessary

Set Up 2
   • Assemble necessary test materials

Admission and briefing of subjects
   • Check that work area is standard for each subject and that all
     have necessary materials

Instructions
   • Read to subjects
   • Give order to begin

Process observations
   • Subject's individual work styles
   • Adherence to standard procedures
   • Etc.

Scoring
   • During or after test as per specifications

Use of Scores

Figure 6-3. Some Typical Testing Steps



Remember, you must ensure standardization of all aspects of the test
situation. Figure 6-4 summarizes the components of the test situation
which you, as test administrator, must be sure are standardized.

Components                    Examples

Environmental Variables       • Lighting conditions
                              • Noise level
                              • Temperature
                              • Humidity

Personal Variables            • State of health
                              • Time since rising
                              • Time since last meal

Instructional and Tester's    • Written or spoken instructions
Variables                     • Variations in tester work load
                                (especially in group test situations
                                when process observations must be made
                                as well as product evaluations)

Figure 6-4. Three Components of the Test Situation



SCORING PROCEDURES

The aim of test scoring procedures is to obtain an accurate estimate
of the trainee's competence. The less a test resembles a "hands-on"
measurement, the more difficult it is to reach an accurate performance
measure. In cases where the measures are performance ratings, you should
use several raters to judge the performance, rather than using a single
observation. Be sure that raters are capable of making the judgments
required. You are then in a position to assign scores with greater
confidence, provided that the raters agree among themselves most of
the time. If interrater agreement is very low, you should hesitate in
interpreting the results. If interrater agreement cannot be achieved,
the test items need to be reevaluated. (More about this in the "Rating
Scales" section of this chapter.)



A number of different types of CRT scoring are currently in use. The
proper scoring method is chosen with reference to a particular CRT, and
with consideration of the complexity of the tasks and/or products required.
The following sections discuss some common types of CRT scoring, including:

   • Assist scoring
   • Go - no-go scoring
   • Fixed point systems
   • Rating scales



Assist vs. Non-Interference Scoring

In CR testing, subjects generally proceed from the beginning to end
of a test without comment or action on the part of the tester
(non-interference). This type of scoring is often used in tests which call
for the completion of a series of steps or which require production of a
prespecified product.

Some CRTs may, however, require scoring each step in a process.
Thus, at each step, the student's performance is approved (scored "go")
or he is assisted (and scored "no-go") before proceeding. Assist scoring
may be employed for diagnostic reasons. Remedial training may then be
focused on missed steps. This saves retraining time and expense. Assist
scoring may also furnish valuable clues to areas where instruction might
be improved. (A large number of errors in step number 3 of a 15-step
procedure, for example, may indicate an area where instruction could be
improved.)

Example of Assist Method. After preliminary training, a food service
course objective might require testing a trainee's ability to prepare a
large meal. Here, it may be appropriate to observe each step in the
cleaning, preparation, and serving of the meal, correcting and recording
errors as they are observed. If the entire sequence is carried out
properly, the product measure will be scored "go." If errors are observed,
the trainee may require additional training on the deficient steps. By
using an assist method of scoring, not only is diagnostic information
obtained, but a large meal is "saved": the meal can be served. The
trainee would be scored "no-go" if he was assisted on the test. However,
the need for additional training before retest would be minimized.

Generally, noninterference scoring is used with CRTs. The simplest
noninterference scoring is "go - no-go" scoring. It is generally used
to score simple, objective "hard skill" processes or products. Since
the score is either "go" or "no-go," the action must be performed (or
the product assembled or created) exactly as specified in the objective.
The item is essentially an observable expression of the standard in the
objective. Either performance on the item meets the standard or it does
not; there is no "gray area."



Examples of Go - No-Go Scoring.

   • A man is given 10 minutes to detect and replace a defective
     transistor in a radio set. He either does (go) or does not
     (no-go) have the unit operational within the allotted time.

   • The assistant gunner on the M-102 Howitzer has the responsibility
     for setting the quadrant on the quadrant sight and firing the
     weapon. The required processes are:

        • Turning the counter handle to the appropriate
          numerical reading.

        • Raising or lowering the tube until the bubbles
          on the sight are level.

        • Firing the gun by pulling the lanyard on command.

     Since this task can be precisely checked for accuracy, a
     passing score (go) is assigned only if no errors are observed
     on any of the above items.



Fixed Point Scoring

Another type of CRT scoring method is known as fixed point scoring.
This type of scoring is appropriate when the task or product to be scored
can be broken into several levels which may be quantitatively distinguished.
For example, the item may call for adjusting valves to specified tolerances.
If the trainee adjusts them to the exact tolerance, he gets 4 points. If
he adjusts them to within ± .001 inch, he gets 3 points; ± .002 inch = 2
points; ± .003 inch = 1 point. No points are awarded if the trainee is off
by ± .004 of an inch or more.
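The valve example can be written as a small scoring function. This is a minimal sketch, assuming the deviation is recorded in whole thousandths of an inch; the function name `valve_points` is made up for this illustration.

```python
def valve_points(thousandths_off: int) -> int:
    """Points for one valve adjustment.

    `thousandths_off` is the deviation from the exact setting, in whole
    thousandths of an inch (an assumption of this sketch); the sign is ignored.
    """
    # 0 off -> 4 points, .001 -> 3, .002 -> 2, .003 -> 1, .004 or more -> 0.
    return max(0, 4 - abs(thousandths_off))

for off in (0, 1, -2, 3, 4):
    print(off, valve_points(off))
```

Taking the deviation as an integer count of thousandths avoids floating-point comparisons at the boundaries between point levels.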



An alternate type of fixed point scoring uses "go - no-go" decisions
on components of a task. For example, trainees may be asked to overhaul
a carburetor, and a point value assigned to different components of the
task:

   Task Component                                      Points

   1. Correct disassembly of carburetor                   1
   2. Correct cleaning of carburetor                      1
   3. Correct replacement of jets and
      parts of carburetor                                 1
   4. Correct reinstallation of carburetor                1

A score of 4 indicates that all components of the task have been correctly
performed. If the trainee failed to replace the jets and float but
correctly performed components 1, 2, and 4, he would score 3 points on the
task as a whole. A single test could test several tasks, each requiring
performance on multiple components (subtasks).
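The component-by-component scoring of the carburetor task can be sketched as a simple checklist sum. The component names and the `task_score` helper below are illustrative choices for this sketch, not wording taken from an official checklist.

```python
# Components of the carburetor overhaul task, one point each.
COMPONENTS = [
    "disassembly",
    "cleaning",
    "replacement of jets and parts",
    "reinstallation",
]

def task_score(results: dict) -> int:
    """Add one point for every component the tester scored "go" (True)."""
    return sum(1 for name in COMPONENTS if results.get(name, False))

# Hypothetical trainee who missed component 3 only.
trainee = {
    "disassembly": True,
    "cleaning": True,
    "replacement of jets and parts": False,
    "reinstallation": True,
}
print(task_score(trainee))
```

A component that is missing from the tester's record is treated as "no-go" here, which is a design choice of the sketch rather than a rule stated in the text.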



Scoring is generally done using a checklist. All behaviors (or
products) required by objectives are clearly defined. If the objective
involves a product, scoring may compare the trainee's product with a
sample product. For example, if an objective requires filling, sanding, and
painting a dented metal surface to appropriate body shop standards, each
finished product (the painted surface) is compared to a standard product.
The top standard is a smooth, high gloss metal surface. If the trainee's
product is similar to this, he receives 4 points. The next standard
is a smooth, high gloss metal surface with slight ripples. If the trainee's
product resembles this, he gets 3 points. This progresses down to the zero
point standard, which is represented by a metal surface which is finished
so poorly that no points can be assigned.



Mixed Scoring Techniques

Sometimes several scoring procedures can be combined in one test.
For example, suppose a test for the position of Radio/Telephone Operator
has the following overall objective:

   "RTO (Radio/Telephone Operator) must be able to maintain
   the pack-mounted radio. Maintenance includes elementary
   troubleshooting, spot painting, periodic checks of
   rubber seals for cracks, and checking cable connections
   for fraying. The operator must demonstrate ability to
   translate and transmit frequencies and call signals of
   necessary units designated in the Signal Operating
   Instructions. He must also demonstrate ability to key the
   encoder with the Cryptographic Access Codes."




In this example, we can identify several objectives in the statement:

   1. Ability to maintain equipment in working order
   2. Ability to troubleshoot defective equipment
   3. Ability to correctly identify incoming messages
   4. Ability to accurately translate incoming messages

So, we have broken down the duties of the RTO into 5 separate skills
which may be tested and scored separately.



Objectives 1 and 2 might be scorable on a go - no-go basis. (Trainees
are given a defective PRC-25 and uniform amounts of time to have their
set operational.) Objectives 3, 4, and 5, however, might be scored on a
point basis (go assigned for a score above a cut-off point fairly close to
100 percent). If items pertaining to separate skills can be grouped and
scored together, there is no real problem in testing an objective which
is composed of different subtasks.



Rating Scales

Rating scales may be used to score CRTs dealing with more complex
situations than those involved in "go - no-go" and fixed point systems.
If the objective specifies characteristics of an acceptable action
or product, a rating scale may be appropriate. Each item must be assigned
a value on an explicit basis, so that independent raters will be able to
agree consistently in their scoring. If possible, use two or more raters,
who work independently.



To obtain a rough estimate of interrater agreement, line up the
scores that each rater assigned each trainee on each item. Figure 6-5
shows an example for a six-item test taken by six trainees and scored
by three raters using a 1-5 rating scale.



Looking across a row, you can compare the scores assigned by the
different raters for each trainee. In the sample data presented, you
can see that there is perfect agreement among raters on items one and
five. On items two, three, and six, there is some disagreement. On
item four, interrater agreement is very low: no raters agree on the
score for any individual, and there is a range of four points between
some ratings on that item. Thus, item four would appear to be the
item most in need of revision to increase interrater agreement, or of
removal from the test.



[Table: ratings on Items 1 through 6 for each trainee, with one column
each for Rater 1, Rater 2, and Rater 3. The individual entries are not
legible in this copy.]

Figure 6-5. Comparison of Ratings on a Test
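The row-by-row comparison described above can be sketched in Python. The ratings below are hypothetical values invented for this illustration; they are not the entries of Figure 6-5, and the `agreement_summary` helper is likewise an assumption of the sketch.

```python
# Hypothetical ratings (NOT the values from Figure 6-5).
# ratings[item] holds one (rater 1, rater 2, rater 3) tuple per trainee.
ratings = {
    1: [(3, 3, 3), (4, 4, 4), (2, 2, 2)],
    2: [(3, 4, 3), (4, 4, 4), (2, 3, 2)],
    3: [(1, 5, 3), (2, 4, 5), (5, 1, 3)],
}

def agreement_summary(rows):
    """Return (share of trainees rated identically by all raters, widest spread in points)."""
    spreads = [max(r) - min(r) for r in rows]
    perfect = sum(1 for s in spreads if s == 0) / len(rows)
    return perfect, max(spreads)

for item, rows in ratings.items():
    perfect, worst = agreement_summary(rows)
    print(f"item {item}: identical ratings for {perfect:.0%} of trainees, widest spread {worst}")
```

An item like hypothetical item 3, where no trainee receives identical ratings and the spread reaches 4 points on a 1-5 scale, is the kind of item the text singles out for revision. This is only a rough screen, not one of the precise statistical techniques the manual refers to.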



The point system by which Olympic divers are compared to an "ideal"
dive (perfect performance of objective) is an example of a rating scale.
Divers are not being compared directly to each other, but to a hypothetical
"perfect performance" from which all divers fall short in some way or
another.



In developing rating scales, the point assignment must be tied to
criterion levels specified in the objective. If possible, point
assignments should be behaviorally anchored. For example:

   1 = does not complete job
   2 = completes job in 45 minutes
   3 = completes job in 30 minutes
   4 = completes job in 15 minutes
   5 = completes job in 5 minutes



There are precise statistical techniques for assessing interrater
agreement. For example, see:

   Guilford, J. P. Psychometric Methods. 2nd edition. New York:
   McGraw-Hill.

Such behavioral anchoring will help to improve interrater agreement.
The technique is, nevertheless, clearly more subjective than the fixed
point system, and therefore places additional responsibility on the tester.
Ratings of ill-defined, "global" behaviors should be avoided entirely. For
example, a rating with items such as "1 = does job poorly" and "5 =
does job very well" would not be suitable since it would be likely to
measure rater attitudes and opinions rather than the rated person's
performance.



Figure 6-6 summarizes the three types of CRT scoring that we've discussed.

Go - No-Go
   Scoring method: Behavior performed correctly or not, product
   produced correctly or not.
   Example: Trainee must jump trench after crouching and checking for
   sounds.

Fixed Point Assignment
   Scoring method: Points assigned to elements of a task, with maximum
   score achieved when all items are perfectly performed; maximum
   points go to a perfectly performed task or perfect product; no
   points are assigned if task is below minimum acceptable standards.
   Example: In a complex first aid procedure such as wrapping a
   bandage, 1 point may be assigned for selection of the proper
   bandage, a second point assigned for wrapping the wound tightly, a
   third for covering the wound completely, etc.

Rating Scales
   Scoring method: Numerical values attached by raters to a
   performance or product, in which judgments of different raters may
   vary and therefore scores are not fully objective.
   Example: Judging diving or marching for form, with values assigned
   to behavior on basis of its closeness to perfection.

Figure 6-6. Types of CRT Scoring



Establishing Cut-Off Scores 



CRTs are designed to assess proficiency on a given task or objective.
Since it is often impractical to insist on complete mastery of the task
(100 percent of items performed correctly) it may be necessary to decide
upon a cut-off point (a score below which is considered failing or "no-go").
The more complex the skills assessed by the CRT and the more varied the
type of performance or products, the greater is the danger of
misclassification (designating a "non-master" as a "master," or vice versa).




There are no fixed rules or formulas for establishing cut-off points.
Factors to consider include:

   • Immediate manpower needs: if manpower needs are very great,
     it may be justifiable to lower cut-off levels, especially
     if errors are less critical than no performance at all.

   • Upper feasible score for an established "master": a target
     may be placed so that even the marksman may score only
     50 percent hits. If we set a cut-off point well above that
     level, we will pass no one at all.

   • Criticality of the objective: the greater the risk of
     substantial damage to persons or to property, the higher the
     cut-off score should be.

If a test is measuring more than one objective and cut-off
scores are necessary, a cut-off level should be established
for each objective.



For example, if one objective has four go - no-go items associated with
it, the cut-off point for that objective might be passing any three out of
the four items. Another objective in the same test may have eight items,
with a cut-off score of passing any 6 out of the 8. Thus, a total of 12
points are possible on this two-objective test. If a person scores 9, he
doesn't necessarily pass the test. He may have passed all four items
associated with the first objective and failed 3 out of the 8 associated
with the second.
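The two-objective example can be sketched as a pass/fail rule applied per objective. The dictionary layout and the name `passes_test` are assumptions made for this illustration.

```python
# Cut-offs from the example: objective -> (items required to pass, items on the test).
CUTOFFS = {"objective 1": (3, 4), "objective 2": (6, 8)}

def passes_test(items_passed: dict) -> bool:
    """A trainee passes only if the cut-off for every objective is met."""
    return all(items_passed[obj] >= need for obj, (need, _total) in CUTOFFS.items())

# Both trainees below score 9 of the 12 possible points.
print(passes_test({"objective 1": 4, "objective 2": 5}))  # misses objective 2
print(passes_test({"objective 1": 3, "objective 2": 6}))  # meets both cut-offs
```

The same total of 9 leads to different decisions, which is why the text warns against judging the whole test by a single combined score.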



Establishing cut-off points is a complex matter. You should reach a
decision on this matter only after careful consideration of the acceptable
performance standards for the task(s) and task criticality. In general,
cut-offs are useful when:

   • Absolute mastery of the task is not expected but a suitable
     level of performance is specified in the objective.

   • Absolute mastery is possible but factors other than competence
     affect the score (such as careless errors, measurement
     errors, etc.).



False Positives and False Negatives

The heart of CR testing is that "masters" must be correctly
distinguished from "non-masters" in terms of specified criteria. It is
important that competent people are not failed and that incompetent ones
are not passed. Figure 6-7 outlines the concepts of "false positive" and
"false negative" and shows possible results of such misclassifications.





False Positive
   Definition: A trainee is given a "go" or point score above the
   cut-off but is really not a "master."
   Possible reasons for error:
      • Lucky guessing
      • Test just "hit" the right items
      • Bias
      • Measurement error
   Realistic consequences:
      • Inability to perform work properly

False Negative
   Definition: A competent person who has in fact mastered the task
   is given a failing score.
   Possible reasons for error:
      • Illness
      • Spurious behavioral fluctuations
      • Measurement error
      • Bias
      • Complexity of instructions
   Realistic consequences:
      • Waste of training money
      • Possible unavailability of competent man because his skills
        are unrecognized

Figure 6-7. False Positives and False Negatives



Figure 6-7 shows that the consequences of either type of error may be
extremely costly. Since CRTs may be employed to assess competence in
widely varied tasks, it is difficult to make a general rule about
appropriate places to set cut-off levels. However, a good guideline is
specified below.

   If the cost of a false positive (passing an incompetent man)
   is very high, the cut-off point should be set very high.

This will eliminate trainees who are fairly competent (but not "masters").

One technique for reducing the numbers of false positives and false
negatives, thereby reducing the likelihood of misclassification, is to
increase the number of test items in use. It may be possible in some
situations to increase the number of items simply by repeating the same
item more than once (as in requiring student pilots to land a plane on
more than one occasion).


Recording and reporting CRT results must be done in a precise, factual
manner. After administering and scoring the test, the tester may, in
addition, wish to obtain additional information. The following steps
should be taken after dismissing the trainees from the testing situation:

   • Retrieval and storage of relevant test materials, if
     any (pencils, answer sheets, rifles, dummy rifles, etc.).

   • Spot recheck of trainees' records for legibility.

   • Recording of any additional process or product information
     which the tester observed and considers relevant
     to assessing the mastery of the task.

Behavioral observations which may shed light on the interpretation of
test scores should be included with results whenever possible. For example,
if trainees consistently complete all tasks on a go - no-go series in a
very short time, this may be relevant to future training. On the other
hand, a student may successfully get his radio in operational shape, but
use an excessive amount of materials in doing so, or may damage the casing.
Strictly adhering to the standardized scoring of the test might indicate
a "go" score, but the tester may feel the task was carried out improperly.
The correct course of action in this case is to score the individual
according to standard procedures, but to supplement the report with
appropriate observations.



SPECIAL PROBLEMS

Standardizing format, administration conditions, and scoring of a CRT
will minimize unusual problems. Nevertheless, difficult cases may appear;
for example:

   • A soldier halfway through the only available form of a test
     develops an illness (or is for some other legitimate reason
     unable to continue). There is no second form of the test
     and the soldier has already seen the first form. What to do?

   • Test results for a group of men must be obtained immediately,
     but there is inadequate staff (personnel) to observe all of
     the process information required to assess mastery of
     objectives.

   • The CO requests the names of the 5 most skilled soldiers.
     The CRT shows 15 men with perfect scores. How are the honor
     graduates chosen?

Such problems are not internal to the CRT, but involve outside constraints
or demands which cannot be met without weakening the standardization of
the test or using it in a way for which it was not designed.

In situations such as these, you must decide, in conjunction with
other interested persons, what are likely to be the costs and results.
The man in the first example who developed an illness during the test
might be observed individually in a "hands-on" situation to assess his
competence. Or, when manpower needs are considered, this particular
person may not be needed for that particular task. Answers to such
questions can only be decided by personnel in a position to assess the
needs of the program, the man, and the costs of various alternatives.

If special considerations seem to demand that testing is needed
immediately (even if the standardization of scoring is below par due to a
shortage of trained personnel, for example) the person requesting the
immediate information should be informed of the dangers involved. If it is
still necessary to administer the test under such circumstances, all scores
are called into question, and this should be noted on the report. Ideally,
a retest with an alternate form of the same CRT should be administered later.



Finally, as has been emphasized previously,* it is not usually
appropriate to use CRT results in a normative way (i.e., deciding who is
best among those passing or worst among those failing). An NRT is called
for in such cases. CRTs should be used in such a context only with the
greatest caution, and preferably not at all.

* See the section entitled "CRT or NRT" in Chapter 1.

Figure 6-1. Sequence of Operations for
Administering and Scoring CRTs

ASSESSING RELIABILITY AND VALIDITY



Two very important activities remain after you have developed your
CRT: measuring the reliability of your test, and determining your test's
validity.



Reliability refers to the extent to which a test yields consistent
scores. If a test has high reliability, the same people should fail each
time they take the test, while those who pass should do so consistently
(assuming that no learning has intervened between test administrations).
On a test which has low reliability, on the other hand, people of similar
ability on the task may vary widely in their test scores, with some passing
and some failing each time they take the test. If a test is highly
unreliable, the same individual may pass it one day and fail it the next (or
vice versa) just by chance fluctuation. Thus, it is essential that your
test be reliable. If it isn't, using it would be like using an altimeter
which sometimes reads "200 ft" when you're at 200 feet above sea level and
sometimes gives the same reading when you are at 18 feet above sea level.
The results of using an unreliable CRT are likely to be nearly as
unfortunate as flying a plane with an unreliable altimeter and, conceivably,
equally disastrous.



Validity refers to the extent to which a test actually measures what
it is supposed to measure. For example, consider a multiple-choice
paper-and-pencil test on first aid procedures, developed as a low fidelity
measure of ability to administer correct first aid treatment. This test
may be reliable (that is, the same people may score about the same on it
each time they take it, or take alternate forms of it), but it is not
necessarily valid. To determine if it is valid, you would have to determine
whether a high score on the test means that a person can actually
administer correct first aid treatment, while a low score means that he
cannot. In other words, just because a test is reliable does not
necessarily mean that it is valid.



On the other hand, a test which is not reliable cannot be valid. If
a test does not give consistent results, it cannot be said to measure
anything accurately. Consider the altimeter which sometimes registered
"200 ft" at 200 ft above sea level and sometimes "200 ft" when actually at
18 ft above sea level. Is it a valid measure of height above sea level? No!
It clearly is not accurately measuring altitude.

Suppose this same altimeter consistently registered "200 ft" when the
plane is flying at 200 mph, "400 ft" at 400 mph, etc. In a sense the
altimeter is "reliable": it gives the same results under the same
conditions. But a wire is crossed somewhere; the altimeter is measuring
airspeed, not what it is supposed to be measuring, which is altitude.

CRTs, of course, should be as reliable and as valid as possible. If
you have followed the steps for the construction and administration of CRTs
outlined in the preceding chapters, you have already gone a long way toward
maximizing reliability and validity. The steps presented helped you "build
in" reliability by standardizing test conditions and by increasing the
number of items in your test. The item pool tryout and review processes
helped you increase reliability and validity by selecting the best and most
consistent items. Matching the items to the objectives helped you maximize
validity by assuring that the test items measure what they are supposed to
measure.



Nevertheless, you cannot assume that your test is reliable and valid
enough to be useful simply on the basis of having carefully followed the
CRT construction process. There are many potential sources of error that
can lower reliability and validity of the most carefully thought-out test.
What you must do is determine your test's reliability and validity in
actual use. This chapter presents techniques for doing that. Figure 7-1
(foldout at the end of this chapter) shows the sequence of operations
involved in assessing reliability and validity.



ASSESSING RELIABILITY

The first thing to do in evaluating the usefulness of your test is to
assess its reliability. If it is not reliable, there is little sense in
checking its validity. When you assess the reliability of a test, you are
essentially asking, "How consistent a measure is this test?"

A CRT, like any measurement device, has possibility for error in its
use. Consider a ruler, probably the simplest type of measurement device.
If you measure a person's height over 10 days, you would expect to get the
same results on each day. But there will always be some measurement
error, even under the best, standardized conditions. So, the first day,
you may find the height to be 5'9-5/32", the second day 5'9-1/8", the third
day 5'9-3/16", etc. The extent to which your measurement is consistent
over repeated trials defines its reliability.



Computing φ as an Estimate of Reliability

One good way to estimate the overall reliability of your test is to
see the consistency with which people pass or fail it. The principle is:

If the test is reliable, people who pass the first time should
pass the second time, while people who fail the first time should
fail the second time.

Reliability estimates based on this principle are called estimates of
test-retest reliability.

In Chapter 5, you saw how to compute φ for item analysis purposes.
You can also use φ as a simple estimate of test-retest reliability. To do
this, you should have a group of at least 30 people to whom you can
administer the test twice. These people should be sampled randomly from the
population of people who would ordinarily take this test. In order to
estimate test-retest reliability properly, you need to test the same group
of people twice, close together in time.

You should let only about one day elapse between the first
time you test them and the second time.
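
The sampling step described above can be sketched in Python. This is only an illustration under stated assumptions: the roster of trainee identifiers, the function name `draw_retest_sample`, and the fixed seed are hypothetical and are not part of this manual's procedure.

```python
import random

MIN_SAMPLE_SIZE = 30  # minimum group size suggested in the text


def draw_retest_sample(population, size=MIN_SAMPLE_SIZE, seed=None):
    """Draw a simple random sample, without replacement, from the test population."""
    if size < MIN_SAMPLE_SIZE:
        raise ValueError("use at least 30 people for a test-retest estimate")
    if len(population) < size:
        raise ValueError("population is smaller than the requested sample")
    rng = random.Random(seed)  # a seed makes the draw repeatable for your records
    return rng.sample(list(population), size)


# Hypothetical roster of 200 people who would ordinarily take the test.
roster = [f"trainee-{n:03d}" for n in range(1, 201)]
sample = draw_retest_sample(roster, seed=7)
print(len(sample), "trainees selected for both administrations")
```

The same people drawn here would take the test on both days; drawing a fresh sample for the second administration would defeat the purpose of the estimate.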

Another important point is:

Do not tell the trainees that they will be tested again.

This is very important since you don't want students to practice between
test administrations or try to recall the test in detail. Test-retest
reliability assumes no practice between administrations and equivalent
conditions both times. So, it is helpful if the trainees are kept occupied
between administrations and don't have time to practice.

"Equivalent conditions" applies not only to the test environment but
also to the trainees themselves: trainees should be equally rested, equally
hungry, etc. during each administration. Thus, it is a good idea to test
them at the same time both days.

To calculate φ for test-retest reliability estimates, set up your
results from the two test administrations in a matrix such as that shown
in Figure 7-2.

                             First Administration
                             Pass         Fail

    Second          Pass      A            B         A+B
    Administration
                    Fail      C            D         C+D

                             A+C          B+D

Figure 7-2. Matrix Used for Computing φ in Test-Retest Reliability
Estimates



You fill out this matrix similarly to the way you filled out the item
analysis matrices described in Chapter 5. In cell A, you enter the number
of people who passed the test both times; in cell B, enter the number of
people who failed the test the first time, but passed it the second time.
In cell C, enter the number of people who passed the test the first time,
but failed it the second time. And in cell D, enter the number of people
who failed the test both times. The marginal total A+B shows the number
of people who passed the second test administration, while C+D shows the
number who failed the second time. B+D shows how many failed the first
time, while A+C shows how many passed the first administration.
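
If your results are recorded person by person, the cell counts can be tallied mechanically. The following is a minimal sketch, assuming each person's results are stored as a pair of pass/fail values; the function name `tally_retest_matrix` and the six sample results are hypothetical.

```python
from collections import Counter


def tally_retest_matrix(first, second):
    """Count cells A-D from paired pass/fail results (True = pass).

    first[i] and second[i] are the same person's results on the
    first and second administrations.
    """
    if len(first) != len(second):
        raise ValueError("each person needs a result for both administrations")
    counts = Counter(zip(first, second))
    return {
        "A": counts[(True, True)],    # passed both times
        "B": counts[(False, True)],   # failed first, passed second
        "C": counts[(True, False)],   # passed first, failed second
        "D": counts[(False, False)],  # failed both times
    }


# Hypothetical results for six trainees.
first_results = [True, True, False, False, True, False]
second_results = [True, False, True, False, True, False]
cells = tally_retest_matrix(first_results, second_results)
print(cells)
```

The marginal totals follow directly from the cells: `cells["A"] + cells["B"]` is the number who passed the second administration, and so on.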



Figure 7-3 shows test-retest matrices filled out for two different
tests. Let's use these matrices to calculate an estimate of test-retest
reliability for each of the two tests.

Figure 7-3. Matrices for Test-Retest Reliability Estimates With
Sample Data, for Two Different Tests



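Remember that the formula for computing φ is:

\[
\phi = \frac{AD - BC}{\sqrt{(A+B)(C+D)(A+C)(B+D)}}
\]

Thus, for Test A,

\[
\phi = \frac{(14)(10) - (5)(1)}{\sqrt{(19)(11)(15)(15)}}
     = \frac{135}{\sqrt{47{,}025}}
     = \frac{135}{216.85}
     = .62
\]

And, for Test B,

\[
\phi = \frac{(16)(10) - (10)(4)}{\sqrt{(26)(14)(20)(20)}}
     = \frac{120}{\sqrt{145{,}600}}
     = \frac{120}{381.58}
     = .31
\]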
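
The same arithmetic can be checked with a few lines of Python. This is a sketch, not a prescribed tool: the function name `phi` is an assumption, and the cell counts are the ones implied by the worked arithmetic for Tests A and B.

```python
import math


def phi(a, b, c, d):
    """Phi coefficient for a 2x2 matrix with cells A, B, C, D."""
    denominator = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    if denominator == 0:
        raise ValueError("phi is undefined when a marginal total is zero")
    return (a * d - b * c) / denominator


# Cell counts implied by the worked examples (A, B, C, D).
phi_test_a = phi(14, 5, 1, 10)
phi_test_b = phi(16, 10, 4, 10)
print(f"Test A: {phi_test_a:.2f}")
print(f"Test B: {phi_test_b:.2f}")
```

Note that φ cannot be computed when everyone passes (or everyone fails) one of the administrations, because a marginal total is then zero; the sketch raises an error in that case rather than returning a misleading value.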



So, Test A is more reliable than Test B, in terms of test-retest
reliability. But what value of φ indicates that a test is sufficiently
reliable? A useful rule of thumb is:

A φ less than +.50 indicates that the test is of questionable
reliability. A φ of +.50 or more indicates that the test has
sufficient reliability. (Remember that φ can range from -1.00
through 0 to +1.00.)



Thus, Test A in our example qualifies as reliable; Test B does not.
Remember that +.50 is a rule of thumb and should not be followed rigidly.
For example, if you found that one test had a test-retest reliability of
.52, while another had a reliability of .43, you would not be justified in
saying that the first was reliable and the second was not.



ASSESSING VALIDITY

Once you have determined that your test has acceptable reliability,
you can turn your attention to validity. A reliable test that doesn't
measure the appropriate thing is no better than an unreliable test. There
are three types of validity that are recommended for CRTs:

• Content Validity

• Concurrent Validity

• Predictive Validity

Each type of validity addresses the question "Does this test measure what
it is supposed to measure?" in a different way. Figure 7-4 compares the
three types of validity.



Type          How It Works                                  How To Determine

Content       Compares contents of test to objectives:      Systematically, but
              Do items measure what the objectives say      nonstatistically
              they should measure?

Concurrent    Compares results on test to results on        Statistically
              another measure of the objectives: Is
              success (failure) on test associated
              with success (failure) on another
              measure of the specified performance
              taken at the same time (concurrently)?

Predictive    Compares results on test to results           Statistically
              measured later on the job: Is success
              (failure) on test associated with
              success (failure) on another measure
              of the specified performance taken
              later, when the trainee is actually
              on the job?

Figure 7-4. Three Types of Validity



Now let's discuss each of these types of validity separately.

Determining Content Validity

Content validity is probably the single best way of assessing whether
or not your CRT measures what it is supposed to measure. In assessing the
content validity of a CRT, you systematically check to see if each test
item is measuring exactly what the associated objective says it should.
If all items measure what the objective calls for, the test is content
valid; if they don't, it isn't.* A simple example should help make this
clear: Suppose you have a one-item CRT. The item and its objective are
shown in Figure 7-5.



Objective

    Given the appropriate tools, perform routine preventive maintenance
    on the 45 KW generator as specified in the operating and
    maintenance manual for same, within 30 minutes.

Test Item

    In front of you is a 45 KW generator and the appropriate tools.
    Perform routine preventive maintenance on the generator as specified
    in the operating and maintenance manual. You have 30 minutes to
    complete this task.

Figure 7-5. A One-Item CRT and Its Objective



Does this test have content validity? Well, performing routine
preventive maintenance on a 45 KW generator (the test) is obviously the
best measure of the objective (performing routine preventive maintenance on
a 45 KW generator). So the test is content valid. That is, there is no
better way to measure the objective than the test. Of course, if the
objective itself was not properly developed, then the test is useless.
That is, if the people you are testing are being trained to troubleshoot
the generator, rather than to maintain it, the objective (and any test
based on it) is inappropriate.



Content validity, then, is a matter of the extent to which a test
corresponds with its objectives. Content validity is best viewed as
absolute measurement. From an absolute point of view, the results of a CRT
suggest that either an individual does possess the ability to adequately
perform the task which the objective defines, or he doesn't. If the test
items and objective(s) are precisely matched, the test is content valid.
If all items are not precisely matched with their associated objectives,
the test is not content valid. The items must be representative of all
aspects of their associated objective. Thus, if the objective involves
applying a concept which has three characteristics, the items must include
all three characteristics.

* This assumes that the objectives themselves have been derived from an
appropriate analysis of what the trainee must be able to do.



So, establishing content validity is simply a matter of systematically
checking objectives and items. Basically, there are two steps involved:

• First, check to be sure the objectives have been properly derived
from an analysis of what the trainee must know and/or do in order
to perform the task for which they are being trained.

• Second, check each test item against its associated objective to
see if the item measures exactly what the objective says should
be measured. Be sure that the item covers all aspects of the
objective.

If both checks are affirmative, your test is content valid.
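
The two checks are judgments made by reviewers, not calculations, but recording them systematically can help you keep track of which items still need work. The sketch below assumes a hypothetical record layout (`ItemReview`) and hypothetical item and objective labels; it only combines judgments you have already made.

```python
from dataclasses import dataclass


@dataclass
class ItemReview:
    """One reviewer judgment about a single test item (hypothetical layout)."""
    item_id: str
    objective_id: str
    measures_objective_exactly: bool
    covers_all_aspects: bool


def is_content_valid(objectives_properly_derived, reviews):
    """Apply the two checks: objectives first, then every item against its objective."""
    if not reviews:
        raise ValueError("no items were reviewed")
    if not objectives_properly_derived:
        return False  # first check failed; item matching cannot rescue the test
    return all(r.measures_objective_exactly and r.covers_all_aspects for r in reviews)


def items_to_rework(reviews):
    """List the items that failed the second check."""
    return [r.item_id for r in reviews
            if not (r.measures_objective_exactly and r.covers_all_aspects)]


reviews = [
    ItemReview("item-1", "objective-1", True, True),
    ItemReview("item-2", "objective-1", True, False),  # misses one aspect
    ItemReview("item-3", "objective-2", True, True),
]
print(is_content_valid(True, reviews))
print(items_to_rework(reviews))
```

A single item that fails either judgment is enough to make the whole test fail the check, which mirrors the all-or-nothing view of content validity taken in this chapter.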



If you have many items on your test associated with one objective, be
sure that each item measures exactly what the objective indicates. If your
test includes many objectives, each with more than one item, check each item
against its associated objective. Do this systematically for each item,
and you've assessed the content validity of your test.



You should be aware of the following principle:

If objectives have been properly developed and the test
consists of high fidelity items based on these objectives,
your test will probably be content valid. If, however,
the test consists of medium or low fidelity items, it
probably will not be content valid.



So, if you have a high fidelity test, and a systematic check reveals
that it does not have content validity, you are in trouble: something is
wrong with the test. Either its objectives are not properly derived from
a task analysis, or its items are not matched to the objectives, or both.
Back to the drawing board.

Whether or not your test has content validity, you should also compute
statistical estimates of concurrent validity, predictive validity, or both.
If your test is content valid, this further assessment will answer
important additional questions, such as: "How does performance on the CRT
compare to performance on another measure?"

If your test is composed of low or medium fidelity items and,
consequently, has lower content validity, statistical estimates of validity
are of primary importance. For example, suppose an objective states:

"Be able to execute proper walking motions in a low gravity
environment such as the moon."

and a one-item CRT developed for this objective states:

"Take three steps in a gymnasium using the proper technique for
a low gravity environment."

The item does not measure exactly what the objective calls for, so the test
is not content valid. However, it may be valid in another sense; but to
determine this, you will have to use either a concurrent or a predictive
measure of validity.



Determining Concurrent Validity

Concurrent validity compares individuals' results on your CRT with
their results on some other measure of the performance being tested by
your CRT. Individuals take the CRT and the other measure close together
in time (concurrently). The other measure must be the best available
assessment of performance on the objective(s) in question. A statistical
determination of the degree of association between results on the CRT and
results on the other measure will provide an estimate of the concurrent
validity possessed by the CRT.



Other measures commonly used to establish concurrent validity with a
CRT include:

• Existing tests already in use

• Instructor ratings of students' performance

• Higher fidelity versions of the CRT being validated,
and others



For example, a CRT on first aid techniques may be validated against
instructor ratings of first aid achievement; or, it may be validated against
an existing first aid test which has worked well. A multiple-choice CRT
on vocabulary (such as: given a word to be defined, choose the best
definition: A, B, C, or D) may be validated against a fill-in-the-blanks
version of a vocabulary test (such as: here is the word to be defined, write
a simple definition in the blanks below). The fill-in-the-blanks test is
a higher fidelity measure than the multiple-choice test. Remember, though:

The other measure must be a suitable one. If you don't have
another measure which you consider suitable, you cannot establish
the concurrent validity of your CRT.

Once you have chosen the other measure to use in establishing the
concurrent validity of your CRT, the statistical determination is easy:
φ is often appropriate.

For example, suppose you have a CRT on leadership skills, and the other
measure is the instructor's ratings. To establish the concurrent validity
of the CRT, have your sample rated on leadership skills by the instructor
at about the same time they take the test. Record the results in a matrix
showing the numbers of people passing and failing the CRT and the numbers
of people rated acceptable (passing) and unacceptable (failing) by the
instructor. Figure 7-5 shows such a matrix.

Figure 7-5. Matrix for Concurrent Validation With Sample Data



Then the φ for concurrent validity of your leadership skills CRT is:

\[
\phi = \frac{AD - BC}{\sqrt{(A+B)(C+D)(A+C)(B+D)}}
     = \frac{(36)(16) - (6)(2)}{\sqrt{(42)(18)(38)(22)}}
     = \frac{576 - 12}{\sqrt{632{,}016}}
     = \frac{564}{795}
     = .71
\]



You can use the same rule of thumb suggested for reliability estimates:

If the φ estimate of concurrent validity is +.50 or higher,
your CRT is probably of suitable validity. If φ is
between +.50 and -1.00, your CRT is of questionable validity.
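
When results are kept as one record per person, φ for concurrent validity can be computed directly from the paired results. The sketch below is illustrative only: the function name `phi_from_pairs` is an assumption, the assignment of the two disagreement cells to B and C is assumed (φ comes out the same either way), and the 60 records are rebuilt from the cell counts that appear in the worked example.

```python
import math


def phi_from_pairs(crt_passed, rated_acceptable):
    """Phi between CRT pass/fail and a pass/fail rating on another measure."""
    pairs = list(zip(crt_passed, rated_acceptable))
    a = sum(1 for crt, other in pairs if crt and other)          # pass / acceptable
    b = sum(1 for crt, other in pairs if not crt and other)      # fail / acceptable
    c = sum(1 for crt, other in pairs if crt and not other)      # pass / unacceptable
    d = sum(1 for crt, other in pairs if not crt and not other)  # fail / unacceptable
    denominator = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    if denominator == 0:
        raise ValueError("phi is undefined when a marginal total is zero")
    return (a * d - b * c) / denominator


# Rebuild 60 paired records from the cell counts in the worked example:
# 36 agree-pass, 16 agree-fail, and 6 and 2 disagreements.
crt = [True] * 36 + [False] * 6 + [True] * 2 + [False] * 16
rating = [True] * 36 + [True] * 6 + [False] * 2 + [False] * 16
validity = phi_from_pairs(crt, rating)
print(f"concurrent validity estimate: {validity:.2f}")
```

The same function would serve for predictive validity; only the timing of the second measure differs.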



It is important to make sure that the following conditions hold when you
establish the concurrent validity of your CRT:

• Your sample must be representative of the population for which
the CRT is intended. (Again, random sampling from the population
will accomplish this.)

• Your sample must be relatively large. A random sample of 50 to
100 people may be used, but you will be better off using more
people.



Determining Predictive Validity

Predictive validity is based on the same concept as concurrent
validity, and can be estimated by φ in the same way. Unlike concurrent
validity, though, predictive validity compares students' results on your
CRT with their results on some other measure taken at a later time, when
they are actually on the job for which they have been trained. While the
CRT and the other measure are taken close together in time for concurrent
validity, they may be separated by six months or more for predictive
validity.



So, predictive validity tells you the extent to which results on the
CRT predict results on the job. Typical types of measures used in
predictive validity (predicted by the CRT) include:

• Supervisor's ratings of on-the-job performance

• Other existing tests (such as MOS tests)

• Peer ratings of on-the-job performance

• Objective indices of on-the-job performance, such as amount of
products turned out per day (acceptable or unacceptable), number
of mistakes committed (acceptably few or unacceptably many),
and others

You determine predictive validity using the same φ procedures as for
concurrent validity. For example, you might validate students' performance
on a CRT of leadership skills against supervisors' ratings of their
leadership skills in their units six months later. Use the same rule of
thumb as for reliability and concurrent validity:

Acceptable predictive validity is defined by a φ greater
than +.50.



The same cautions that apply to concurrent validity hold true for
predictive validity:

• The measure against which you validate the CRT must be suitable,
not just the only measure available. (If you don't have another
measure which provides an acceptable assessment of on-the-job
performance on the task tested by the CRT, you can't establish
the predictive validity of the CRT.)

• Your validation sample must be representative of the population
for which the test is intended.



WHAT TO DO IF YOUR TEST RELIABILITY OR
VALIDITY IS TOO LOW

As stated at the beginning of this chapter, your CRT must have both
acceptable reliability and acceptable validity to be useful. In summary,
here are the standards for judging the acceptability of your CRT's
reliability and validity:

• Your CRT has acceptable reliability if the φ estimate of
test-retest reliability is greater than +.50.

• Your CRT should be content valid, unless practical constraints
have caused you to create a low fidelity test.

• Your CRT should have concurrent or predictive validity greater
than +.50, as estimated by φ.
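
The three standards can be collected into a simple bookkeeping aid. This is a sketch under stated assumptions: the function name `check_crt_standards`, its parameter names, and the sample values are hypothetical, and the +.50 cutoff remains a rule of thumb rather than a rigid line.

```python
def check_crt_standards(reliability_phi, validity_phi, content_valid,
                        low_fidelity_required=False, cutoff=0.50):
    """Report which of the three summary standards a CRT meets."""
    return {
        "test-retest reliability": reliability_phi > cutoff,
        # Content validity is expected unless practical constraints
        # forced a low fidelity test.
        "content validity": content_valid or low_fidelity_required,
        "concurrent or predictive validity": validity_phi > cutoff,
    }


# Illustrative values only.
results = check_crt_standards(reliability_phi=0.62, validity_phi=0.71,
                              content_valid=True)
for standard, met in results.items():
    print(f"{standard}: {'meets standard' if met else 'needs attention'}")
```

A value just above or below the cutoff deserves a closer look rather than an automatic verdict.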



If your test does not meet these standards, it is probably not
suitable for use as an Army CRT. Thus, you should either modify it or
create a new test, and then assess reliability and validity again.

Following are some suggestions for modifying your CRT to increase its
reliability and validity:

• You can often increase the reliability of a test by adding items.
Of course, the items must match the objectives. If the test is
measuring several objectives, you must be sure to maintain the
appropriate proportion of items to objectives. After you have
developed and added items, reassess the test-retest reliability.

• A test that is not content valid due to lack of high fidelity
items can be made content valid by reconstructing the items in a
high fidelity format. You may have to modify practical constraints
to do this, or make the test less feasible to administer
conveniently. But a difficult-to-administer, valid test is at
least suitable for use, while an easy-to-administer test with low
validity is unsuitable.

• If you have reason to believe that your test reliability or
validity is too low because of improper sampling techniques, it
may be appropriate to reassess the test using a new, more
carefully selected sample. Be sure that the sample is properly
large and representative of the population for which the test is
intended. Also take care that the CRTs (and other measures) are
administered in a proper, standardized fashion.



Do not misuse this last suggestion: Don't keep reassessing your
test until you happen upon a time when reliability and validity check out
as acceptable. You should only reassess if you think something was
mishandled in the first assessment of reliability and validity, or if you
modify the test. The test must be reassessed for reliability and validity
after any and all modifications.



If you modify your test and it still doesn't have acceptable
reliability and validity, it may be a good idea to seek help from your test
evaluation unit. They may be able to see a difficulty that is not apparent
to you; they may see the forest, when you've focused on the trees.

Figure 7-1. Sequence of Operations Involved in Assessing Reliability
and Validity

APPENDIX A

CHECKLIST FOR CONSTRUCTING CRTs

You can use this checklist to guide you through the
activities required to develop a CRT, once you are
familiar with this manual. By using this checklist,
you will be able to perform all activities necessary
for the development of an adequate CRT in the proper
sequence. Consult the text if you require further
information on activities. Remember, you should
not use this checklist until you have gained
familiarity with the CRT construction process by using
the manual several times.



CHECKLIST FOR CONSTRUCTING CRTs



□ 

nqfjirtduset. 

□ 

Po ids iii*Kc ob)eca^ extsmil to training 
t^ioM Of carrte $pecffifidL 

□  3. Determine whether a CRT can be built: Test can be scored on an
       absolute basis; standards for acceptable performance can be
       specified.

□  4. Obtain a list of objectives to be tested.

□  5. Check that objectives call for performance on only one task.

□  6. Check that all tasks are independent.

□  7. List the three main parts of each objective to be tested:
       performances, conditions, and standards.

□  8. Check that main intents of objectives are clear.

□  9. Check that performance indicators are simple, direct, and part
       of the trainees' repertoire of behavior.

□ 10. Check that performances, conditions, and standards are stated
       in precise, operational terms.

□ 11. Send inadequate objectives back through channels to their
       originator(s) for revision.

□ 12. List practical constraints.

□ 13. Assess practical constraints in terms of their impact on
       objectives.

□ 14. Develop plan for selecting objectives, if appropriate.

□ 15. Modify objectives, as necessary.

□ 16. Send modified objectives through channels for approval.

□ 17. Determine item format and level of fidelity.

□ 18. Specify whether items will require product measures, process
       measures, or both.

□ 19. Develop plan for item sampling, if appropriate.

□ 20. Specify multiple conditions for testing.

□ 21. Determine number of items to include in test.

□ 22. Complete test plan worksheet documenting test plan.

□ 23. Write test items based on test plan specifications.

□ 24. Develop and document instructions for item presentation and use.

□ 25. Check to be sure that item pool includes about twice as many
       items as test plan specifies.

□ 26. Check that items match objectives.

□ 27. Check that items are clear, unambiguous, easy to administer, and
       at the proper level of fidelity.

□ 28. Develop general test instructions.

□ 29. Check that general instructions are clear, unambiguous and as
       brief as possible.

□ 30. Select an appropriate sample for item pool tryout.

□ 31. Check that item pool tryout sample is composed of "masters" and
       "non-masters."

□ 32. Check that tryout sample size is at least 50% larger than the
       number of items.

□ 33. Check that tryout sample is random.

□ 34. Conduct item pool tryout.

□ 35. Conduct an item analysis on tryout results.

□ 36. Obtain feedback from individuals in the tryout sample.

□ 37. Record comments from peer review of item pool.





□ 

mlfsm'tfwoi insmpocL 

t 

□ 

pad iy/ subject n»mer«p«ti. 

□ 

tryout fc e dbj t ^ arid vanousm'ieiw 
of itesi pocA «/ttn3?5)0/ ^^ifir Saw- 

□ 

^mmary SfMet » an «idL 

□ 

□ 43. Develop alternate forms of test, if appropriate.

□ 44. Check that environmental, personal and tester variables are
       standard.

□ 45. Administer the CRT.

□ 46. Score the CRT.

□ 47. Establish cut-off scores.

□ 48. Report test results.

1 I 43. C6aBsttes^re?ertr^^lb^*^tydJ^tloa 

□ 

□ 51. Check that φ is greater than +.50.

□ 52. Assess content validity of CRT.

□ 53. Select an appropriate "other measure" for concurrent/predictive
       validation of CRT.

□ 54. Obtain a relatively large, representative sample for use in
       evaluating concurrent and/or predictive validity.

□ 55. Administer CRT and other measure to sample concurrently or,
       after appropriate interval, predictively.

□ 56. Calculate φ as an estimate of concurrent and/or predictive
       validity.

□ 57. Modify test to increase reliability and/or validity, if
       necessary. Following such modification, reassess reliability
       and validity of test.

You should use this checklist to help you evaluate CRTs
which have already been constructed. This checklist will help
determine the suitability of CRTs which already exist, and
which you may wish to adopt for your own testing needs.

This checklist consists of an ordered series of questions to
ask when evaluating a CRT. Some of these questions pertain
to CRT construction. Other questions concern CRT use. To answer
these, you must know the objectives, intended test population,
practical constraint data, reliability and validity estimates,
etc. So, before using this checklist to evaluate a CRT, collect
the documentation that was used to develop it.

Circle the "Y" beside a question if the answer is "yes."
Also circle "Y" if the question is not applicable to the test.
If the answer is "no," "can't tell," or "partly yes, partly
no," circle the "N" next to the question. When you have
completed the checklist, the circled "Ns" will represent a record
of the particular aspects of the CRT that may require improving
or that require further information before being evaluated.
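
If you record your answers as you work through the checklist, the circled "Ns" can be pulled out automatically. This is a minimal sketch: the function name `items_needing_attention`, the answer labels, and the sample answers are hypothetical conventions, not part of the checklist itself.

```python
def items_needing_attention(answers):
    """Return the question numbers that would be circled "N".

    answers maps a question number to one of "yes", "no", "can't tell",
    "partly", or "n/a". Following the instructions above, "yes" and
    "n/a" are circled "Y"; every other answer is circled "N".
    """
    circled_y = {"yes", "n/a"}
    return sorted(q for q, answer in answers.items() if answer not in circled_y)


# Hypothetical answers to a handful of checklist questions.
answers = {12: "yes", 13: "can't tell", 15: "n/a", 23: "no", 24: "partly"}
flagged = items_needing_attention(answers)
print(flagged)
```

The resulting list is the record of aspects that may need improving or that need more information before the CRT can be evaluated.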



CHECKLIST FOR EVALUATING CRTs



Y ^ Qpg ^tc^ cfcijt cii vecifl for pe rfoCT yce on 



l< Y 



« y 



« Y 



8. Arg p e^at tey cejod^atoatsg^^ 
hador? 



conditions ndstiTkdardsipfiKSfid 

H Y 9. Are c^i^esum free frwn ir);»ci of ^senous 
iTOt fccpkt excesshre tkn^ n^iQpomr and 

frcKa ^ «ittrc.;>3p^sti«i of «6;9c;tnr» 
' 3vail^le for »tin^ 

N Y 11. Are the students being tested unaware of the sample of items
selected for testing?

N Y 12. Does the item format selected best approximate the behavior
specified by the objective?

N Y 13. Is the measurement used the same as that which is required by
the objective (product measurement, process measurement or both)?

H Y 14« Has the pos^bilrtf of fs^ftg mors been hexd 

N Y 15. Is the item format at the highest level of fidelity practical?

A 

^ Y 16. lfiten)$annpiingiMti»ndfcg^^ 

p(aa;,i»s the appropriate curhber of itenu 
- beeainduded?^ 

ft Y 17. ^s ^epcr fo cgiaficebe3D9 tested under ig| 
coodr5or»or; if ftis notpos^blfrto test 
upderan^OQ£5ora« 

mmbefofcoridrtaofy {arid the a ppr op r ate f 
'oocs)7 



1^ y 

yt Y 

K Y 

w y 

h Y 

« Y 

-rN Y 

N Y 



22. 



ilt>}y esy to«dn3tnste& 



N y 

N Y 

N Y 

N V 

N y 



23. Are items at the appropriate level of fidelity?

24. Has the language of the CRT items been kept simple?

25. Is the student informed as to whether speed or accuracy is most
important?

« " - * " 

25. A."e graphs^ drln^iyagd pb to >y a p h s 4aed 
mtennecesmy^deareocKny^isE^o:^ • 

27. Is the item presented in a way which neither gives the student
hints nor makes it unnecessarily difficult?

28. Are instructions common to all items included in general overall
test instructions?

29. Do general instructions for the test include the following
information: purpose of the test, time limits for the test,
description of test standards, description of test items, and
general test regulations?

30. Daspec^instrxjclxnste^lthe>traineecx«^ 
fy lA^m ibe perfmnrce, ccKidhJons and 
srtaridards «i ^ ^ f^d? 

31. Ar^^dear mstructiom provided to :heex^ 

-> * 

32. Have the items been tried out?

33. Was an appropriate sample used in the tryout?

34. Was the tryout sample composed of "masters" and "non-masters"?

35. Was the tryout sample size at least 50% larger than the number
of items?

N Y 36. Was the tryout sample random?
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* » - 
ilfirr&saftdt?'*' " 
^ jaj^lPrigf i?» oiKf t gugT^ ed ^Ttc^oni? 

W Y 43. the «oria3pco5edMr» dear?' 

cordios test results? 

H^y 47« Has^possibC^ ofspociii Pfot^enabean 
ta kfn into arco unt? 



Jft y '43> HattfbaapuBijSciljaeadfetjQ^Z'aitrirt^^ 
4CSy0|iel5^spBKBr:£xa34,^ - 

K Y Si9Kk«iesas:^U3^$^a>«^f£a^ 

^ laa^t'jpd^ieg fci fsl ? . 

«. - ' ^ - 

li y"SiJSfBniAe.caaiwanaii^ 

!^ y 53. %to^£e»jpvBnck^^y3efier^ 

K Y ^ leu tgfili desaorgtreta3 

a c £K ma ; n yiMry<£iBag 

^ y 53. Has tba sesttieen dauKxa-tiatadyi^^sbrsiu^ 
14 Y 57. Areyo;i&o:oj^i^c5nvbsed^8^dK test 



Achievement Test - A test for measuring an individual's level of mastery
of a subject. For example, an achievement test may be given on
4th grade mathematics to see if a student's mathematical ability
has reached the 4th grade level. "4th grade level" may be
defined in terms of the average 4th grader's scores, in which
case the test would be norm-referenced, or in terms of
standards for 4th graders, in which case the achievement test
would be criterion-referenced.

Aptitude Test - A test to determine an individual's learning capability
in an area of instruction. For example, a test of mechanical
aptitude would measure people's ability to learn to perform
tasks involving mechanical skills and knowledges, not their
present ability to perform mechanical tasks.

Conditions - One of the main parts of an objective that tells: 1) what
the student has to work with, 2) the environmental circumstances
under which the performance must be demonstrated,
3) what the student must work without, 4) his starting points,
and 5) any limitations, special instructions, etc.

Course Criterion Test - A test given at the end of a course to determine
if the student has reached the necessary criterion levels for
the subject being taught. Course criterion tests are keyed
to the course objectives and represent a "final exam" on meeting
the standards specified in the objectives.

Criterion - Synonymous with standard (the part of the objective by which
the performance is evaluated). For example, part of the criterion
by which "donning a gas mask" is evaluated is that the
performance be completed in nine seconds or less. If it takes a
trainee ten seconds to don the mask, he has not achieved the
criterion level of performance.

Criterion-Referenced Test (CRT) - A CRT measures what an individual can
do or knows, compared to what he must be able to do or must
know in order to successfully perform a task. Here an individual's
performance is compared to external criteria or performance
standards which are derived from an analysis of what
is required to do a particular task.




Critical Task - A task that if misperformed could lead to loss of life
or property, or to mission failure. For example, in many first
aid procedures, treating for shock is a critical task: even if
the other parts of the procedure are correctly performed, the
individual may die of shock. Bandaging a wound, while important,
would usually not be considered a critical task.



Diagnostic Test - A test used to inform a student of his progress, to
determine if his behavior qualifies him for course entry, or to
establish what objectives or steps he is weak on. For example,
in BCT a diagnostic test is usually given before the comprehensive
performance test (CPT); thus, the trainee gets information on
what he needs to improve before taking the CPT.



Entry Behavior - The performance which a student can achieve in a certain
subject matter upon entering a course of instruction on
that subject. Entry behavior may refer to skills, knowledges,
and attitudes.



Error of Central Tendency - A rating error in which different raters tend
to rate most students toward the middle of the scale. Thus, if
there is a "neutral" point on a rating scale, raters may tend
to rate most students close to it.



Error of Halo - A rating error made due to an observer being biased about
an individual. This may be caused by an observer allowing his
general impression of an individual to influence his judgment.
The resulting shift of the rating can be toward the high end of
the scale (positive halo) or the low end of the scale (negative
halo).



Error of Standards - An error committed in rating due to differences in
the observers' standards. One rater's standards might be higher
than another rater's. Thus, while one rater might rate a person's
performance as "unsatisfactory," another rater might rate that
same person's performance as "satisfactory."

Fidelity - The extent to which a CRT resembles the actual objective (or
performance) being tested. The more the CRT resembles the
performance in question, the higher the fidelity of the CRT. For
example, if you tested a person to see how well he could bandage
a wound by observing him bandaging a wound, the test would have
high fidelity. If you tested him by asking him to answer
multiple-choice questions on how to bandage a wound, the test
would have low fidelity.



Format - The type of test or item organization. Examples of item format
include paper and pencil tests, hands-on performance tests,
multiple choice tests, recall measures, job simulations, etc.



Hands-On Performance Measure - A type of performance measure where the
individual is tested on the apparatus for which he was trained
(no paper-and-pencil tests). A hands-on performance measure
of generator repair would require the trainee to actually repair
a generator.



Indicator - The action verb of the objective's task statement through
which the ability to do the performance specified by the main
intent is inferred, when the main intent itself is not directly
observable. For example, if the main intent is "Discriminate
between shears used for cutting a straight line in tin and those
used for cutting a curved line," the indicator might be "by
circling the picture of shears used for cutting a curved line."
Note that in this case the main intent--"Discriminate"--is
covert; that is, it is not directly observable. Thus, an
indicator had to be added.



Item Analysis - A technique used to help spot bad items. A number of
techniques can be used to do this, all of which use the following
principle: Acceptable items discriminate between "masters"
and "non-masters." Unacceptable items are incapable of making
such a discrimination. So, in item analysis, you look for items
which are missed by "non-masters" and passed by "masters."
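
The principle above can be illustrated with a minimal sketch. It uses a simple difference in pass rates between the two groups; this index, the function name `item_discrimination`, and the sample data are assumptions made for illustration, not a procedure given in the manual.

```python
def item_discrimination(master_results, nonmaster_results):
    """Return pass rate of masters minus pass rate of non-masters for one item.

    Each argument is a list of booleans (True = the person passed the item).
    Values near +1 suggest the item separates the groups well; values near
    0 (or negative) suggest a bad item. This index is an illustrative
    assumption, not the manual's own statistic.
    """
    p_masters = sum(master_results) / len(master_results)
    p_nonmasters = sum(nonmaster_results) / len(nonmaster_results)
    return p_masters - p_nonmasters


# Hypothetical tryout data: 9 of 10 masters pass, 2 of 10 non-masters pass.
masters = [True] * 9 + [False]
nonmasters = [True] * 2 + [False] * 8
print(round(item_discrimination(masters, nonmasters), 2))
```

An item that both groups pass (or both fail) at about the same rate would score near zero on this index and would be a candidate for review.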



Item Pool - The total set of items constructed for a specified test, be
it a single or multiple objective test. The item pool is reduced
by item analysis and review techniques to yield a final version
of the test consisting of the best items from the pool.

Learning Analysis - An analysis of the steps necessary to obtain the
objective, the skills needed to learn the material presented,
etc. In a learning analysis, you determine what skills, knowledge,
and attitudes individuals must be taught to get them from
their entry behaviors to the behaviors specified by the learning
objectives.



Learning Objective - A learning objective describes what the individual
must know and be able to do at the completion of training. It
may be the same as a performance objective or may be less rigorous
with respect to conditions and/or standards. Thus, a
learning objective tells you what the individual should get out
of training, not necessarily what he must be able to do on the
job. An individual may require further training on the job
after he has achieved a learning objective, before he is able to
meet a performance objective. Learning objectives, like all
objectives, have three main parts: performances (tasks),
conditions, and standards.



Logical Error - An error in rating which may be due to an observer giving
similar ratings to traits which aren't necessarily related.
Two or more traits being rated at the same time may logically
seem related to an observer when they really are not. For example,
a rater might score a person similarly on "follows orders"
and "completes work on time" because the traits seem logically
related, even though they are not necessarily related.



Main Intent - The statement of the task that tells you what the objective
is mainly about: the skill or knowledge the learner is to develop,
or the performance which is the purpose of the objective.
A main intent may be overt (observable)--for example, "disassemble
an M-16"; or covert (unobservable)--for example, "know the
differences in appearance between poisonous and nonpoisonous
snakes." If covert, an indicator must be added to the objective
to tell you how to evaluate the main intent.



Mastery - An individual has attained mastery when he has completed the
training segment that your CRT was developed to test and has
passed the test, showing that he can perform at the
level necessary for successful task completion.



Masters - People who are competent at performing a given task or who have
already completed the training segment that a CRT is being
developed to test. A master can perform the task(s) for which
he has been trained.



Non-Masters - People who are not competent performers, or are not
knowledgeable in the subject matter being tested, or who have
not had appropriate training.



Norm-Referenced Test (NRT) - An approach to testing in which an individual's
test score is compared to the scores of other individuals
regardless of standards specified by an objective.



Objective - A statement specifying skills and knowledge to be tested. It
consists of three parts: 1) performance (task), 2) conditions,
and 3) standards. Thus, an objective states what must be done
(task), the conditions under which it must be done, and how well
and/or how quickly it must be done (standards).



Percentile - A value on a scale of one hundred that indicates the percent
of a distribution that is equal to or below it. For example, if
a person scores at the 95th percentile, this means he has done
as well as or better than 95 out of 100 people who have taken the test.
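
As a minimal sketch of this definition, the percent of a score distribution that is equal to or below a given score can be computed directly. The function name `percentile_rank` and the sample scores are hypothetical, chosen only for illustration.

```python
def percentile_rank(score, all_scores):
    """Percent of the distribution that is equal to or below `score`."""
    at_or_below = sum(1 for s in all_scores if s <= score)
    return 100.0 * at_or_below / len(all_scores)


# Hypothetical distribution: 20 people with scores 1 through 20.
scores = list(range(1, 21))
print(percentile_rank(19, scores))  # 19 of the 20 scores are at or below 19
```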



Performance - One of three main parts of an objective which states
precisely what must be done. Every statement of performance
includes an action verb. Sometimes this verb is the performance
itself and sometimes it is an indicator of the performance.



Performance Measurement - The method used to ascertain whether or not an
individual has achieved the specified criterion level on the
performance of a particular task or tasks.



Performance Objective - A performance objective is derived from an analysis
of what must be done in order to perform a task adequately. Like
any objective, a performance objective has three main parts:
performance (task), conditions, and standards. A performance
objective is the highest level of objective--it tells what must
be done in order to perform a task successfully.

Performance Test - A performance test measures the individual's ability
to perform a particular task or group of tasks. "Can he do the
task properly or not?" is the question that a criterion-referenced
performance test seeks to answer. A norm-referenced
performance test investigates how well an individual can perform
a task "compared to other people." A performance test can
be administered using actual hands-on performance, simulated
performance, or in a paper-and-pencil format (if the performance
in question requires use of paper-and-pencil--calculating
azimuths, for example).



Phi Coefficient (φ) - A simple statistical technique which may be used for
CRT item analysis if the following data are available: 1) which
people pass which items, and 2) which people are "masters" and
which are "non-masters."

\[ \phi = \frac{AD - BC}{\sqrt{(A+B)(C+D)(A+C)(B+D)}} \]

A = number of "masters" who passed the item
B = number of "masters" who failed the item
C = number of "non-masters" who passed the item
D = number of "non-masters" who failed the item
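
A short worked sketch of the phi computation follows, using the standard formula for a 2 x 2 table, φ = (AD - BC) / √((A+B)(C+D)(A+C)(B+D)), with A, B, C, and D as defined above. The function name `phi_coefficient` and the tryout counts are hypothetical; returning `None` when a row or column total is zero is a choice made for this sketch, since the coefficient is undefined in that case.

```python
import math


def phi_coefficient(a, b, c, d):
    """Phi for a 2 x 2 table.

    a: masters who passed the item      b: masters who failed the item
    c: non-masters who passed the item  d: non-masters who failed the item
    """
    denominator = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    if denominator == 0:
        return None  # undefined when any row or column total is zero
    return (a * d - b * c) / denominator


# Hypothetical tryout: 8 masters pass, 2 fail; 3 non-masters pass, 7 fail.
print(round(phi_coefficient(8, 2, 3, 7), 2))  # about 0.50
```

A value near +1 means the item is passed by masters and missed by non-masters; a value near 0 means the item does not discriminate between the two groups.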



φ may also be used as a measure of test-retest reliability and
of concurrent or predictive validity. For such uses the formula
remains the same, but the letters refer to different measures:

Test-Retest Reliability

                              1st administration of test
                                  Fail        Pass
2nd administration     Pass        B           A
of test                Fail        D           C

Concurrent or Predictive Validity

                                     CRT Results
                                  Fail        Pass
Concurrent or      Acceptable      B           A
predictive         Unacceptable    D           C
measure

Population - The universal set of individuals who possess the
characteristic(s) in question. For example, the population possessing
the characteristic "lives in the U.S.A." is the population of
the U.S.A. The population of living U.S. citizens includes all
people possessing U.S. citizenship whether or not they live in
the U.S.A. The population possessing the characteristic "passed
Army BCT during the last year" includes all Army personnel who
have passed BCT in the last year.



Practical Constraints - Factors such as time availability, manpower
availability, costs, etc., which may impair administration of test
items if conditions and standards remain as specified
in the objective. For example, an objective requiring the firing
of nuclear projectiles may well have practical constraints--the
objective would have to be modified so that the test item could
substitute firing "dummy" nuclear projectiles.



Process Measurement - Measurement of a process rather than a product.
Process measurement is indicated when an objective specifies a
sequence of performances which can be observed and when the
performances are as important as the final product of the
performances. It is also appropriate when product cannot be
distinguished from process or when the product cannot be measured for
safety or other constraining reasons. Process measurement
usually requires observing whether or not a performance is done
properly and/or quickly enough, and in the right sequence. An
example of process measurement is scoring a person "go" or
"no-go" on his ability to properly execute an "about face" in drill
and ceremonies.



Product Measurement - Measurement of a product rather than a process.
Product measurement is appropriate if: 1) the objective specifies
a product, 2) the product can be measured as to either
presence or characteristics, and 3) the procedure leading to
product can vary without affecting the product. An example of
product measurement is observing a weapon to see if it has been
reassembled correctly--here, you don't need to watch the weapon
being reassembled (the process) because you can observe the
product to see if it has been reassembled correctly.



Random Sample - A sample in which the individuals chosen from among all
available people of the appropriate type are selected by chance.
A random sample of a population would be composed of people
possessing the characteristic of the population, each of whom
is equally likely to be chosen from the population.
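
Selection "by chance" can be sketched with Python's standard `random` module, which draws without replacement so that each member of the population is equally likely to be chosen. The roster names and the helper `draw_random_sample` are hypothetical and serve only to illustrate the definition.

```python
import random


def draw_random_sample(population, n, seed=None):
    """Draw n distinct individuals, each equally likely to be chosen.

    `seed` is optional; fixing it only makes the draw repeatable.
    """
    rng = random.Random(seed)
    return rng.sample(population, n)


# Hypothetical roster of everyone who possesses the characteristic
# of interest (for example, people who have completed the course).
roster = [f"trainee_{i:02d}" for i in range(1, 41)]
tryout_group = draw_random_sample(roster, 10, seed=1)
print(len(tryout_group))  # 10 individuals selected by chance
```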



Rating Scale - A device used to evaluate achievement. When using a rating
scale for scoring, you should specify the rating a student needs
to achieve criterion level for the performance specified by the
objective. A rating scale might also be used to assess entering
behavior at the start of instruction. Rating scales usually
have three to nine points on them representing levels of
performance from low to high.



Reliability - Reliability is a synonym for "consistency" or
"repeatability." A test is considered to be reliable if it makes the same
discriminations among individuals on multiple occasions. People
should score about the same each time they take the test, if it
is reliable (assuming that they don't learn or forget between
tests). Thus, a person's scores on reliable tests are consistent
and repeatable.



Repertoire of Behavior - The group of behaviors which the student is
capable of performing. Different groups have different repertoires
of behaviors. For example, soldering connections is a part of
the repertoire of behavior of electronic technicians, but probably
not of food service specialists. Multiplying two single-digit
numbers is part of the repertoire of behavior of many 10
year olds, but not of too many 7 year olds.



Representative Sample - A representative sample is one which reflects
(represents) the population for which a test is intended. In
order to try out test items on a representative sample, the persons
in the sample should be similar to those for whom the test
is intended. Thus, if a test is intended for people who have
completed BCT, a representative sample would be composed of
people who have completed BCT. If a test is intended for people
who have completed a field wireman course, a representative
sample would be composed of people who have completed that
course. If a population is sampled randomly, the resulting
group will be a representative sample of that population and
not of any other population.



Screening Device - A device used to screen out trainees who do not qualify
for the training course being considered, either because they
are already masters of the subject matter or because they do not
have the entry behavior required for the course. (A CRT can be
used as a screening device.)






Simulation - A situation where phenomena likely to occur in actual
performance can be reproduced under test conditions without using the
real-life equipment. Simulation can use complex simulators--a
simulated helicopter is an example--or simple simulators--a
rubber bayonet is an example.



Skills - A learned ability to successfully perform a certain action or
related group of actions. While knowledge is often necessary
for skills, the knowledge of how to perform an act is not the
skill--the performance of the act is the skill. Riding a bicycle,
for example, is a skill requiring performance of a related
sequence of actions. A person may have knowledge of how to
ride--he could tell you how to sit, pedal, shift gears, brake,
etc.--without possessing the skill of riding.



Standards - The third main part of an objective which specifies the
criterion by which the performance is evaluated (how well and/or
how quickly a performance must be done). There are several types
of standards that may be included in any objective, any of which
tell how well or how quickly the task must be done. An objective
may have both a standard of quality and of speed.



Subject Matter Expert - Someone who is well qualified in the subject matter
being tested. Such a person reviews the test items because the
test developer may not be an expert in the subject. A subject
matter expert is usually trained and experienced in a particular
subject area.



Task - A part of a job that requires certain performance(s). A group of
tasks comprise a job, while complex tasks may be broken down
into subtasks. The job of auto mechanic, for example, is composed
of many tasks including tune-ups, repairing transmissions,
replacing brake linings, etc. The task "tune-up" is composed of
subtasks such as replace spark plugs, replace points, etc. The
designation of tasks is often arbitrary. If, for example, a
person's job was "tune-up specialist," replacing points would be
a task rather than a subtask. Subtasks under "replacing points"
would include removing old points, putting in new points, setting
gap on new points, etc.

Task Analysis - An analysis of a task (or tasks) to determine the skills
and knowledges necessary to perform it, equipment and/or facilities
required, attitudes required, critical tasks, proper sequence
of actions, etc. Sometimes, all the tasks in a given job
are analyzed by a procedure called "job task analysis" or "job
analysis." Often, task analysis is used as a synonym for job
analysis.



Test Evaluation Unit - A group of people who are experts in the area of
testing. Test evaluation personnel are often expert in educational
technology--they can be of help with many training and
testing problems.



Test-Retest Reliability - Determination of the stability of test scores by
repeated testing. Test-retest reliability assumes that no
training or forgetting takes place between test administrations,
so both administrations should be given close together in time.
If a test has high test-retest reliability, a person should
score about the same each time he takes the test. If it has low
test-retest reliability, a person's score may vary widely from
one test administration to the next.



Validation - The process of determining whether a test actually measures
what it is intended to measure.



Validity, Concurrent - Statements of concurrent validity indicate the
extent to which a test may be used to estimate an individual's
present standing on the criterion. This type of validity reflects
only the status quo at a particular time. In concurrent
validation, individuals' scores on the CRT are correlated with
their performances on another measure of the objective(s) in
question. If people who score high on the CRT score high on the
other measure, while people who score low on the CRT score low
on the other measure, the test is concurrently valid. Of course,
the other measure must be a good one or the concurrent validation
won't mean much.



Validity, Content - If test objectives are based on an adequate task
analysis of what the individual must do, and if the test items measure
exactly what the objectives say they should, the test is content
valid. Content validation is especially appropriate for CRTs.

Validity, Predictive - Statements of predictive validity indicate the
extent to which an individual's future level on a criterion can
be predicted from a knowledge of his test performance. CRT
scores are correlated with another measure of the same performance
which is taken later on the job. If high scores on the
CRT are correlated with success on the job, while low scores are
correlated with lack of success, the CRT has high predictive
validity.



APPENDIX D 



SQUARE ROOT TABLES



How To Use the Square Root Tables

For numbers 1 to 1,000: In Column N, locate the number for which
you want the square root, and immediately to the right, in Column √N,
you will find the answer. For example, the square root of 150 is 12.2474.

For numbers 1,001 to 100,000: (1) Take the number for which you want
the square root and move its decimal point two places to the left. (2) Round
off to the nearest whole number, and find this number in Column N. (3) Take
the number immediately to the right, in Column √N, and move its decimal
point one place to the right. That is the square root.

For example, suppose you need the square root of 1,200. First, move
the decimal point two places to the left. Since this gives you "12.00",
no rounding is necessary. Then look up the square root of 12 in the square
root table, and you find "3.46410". Then move the decimal point one place
to the right and you have the answer: "34.6410".

In some cases, there will be slight rounding error, but this will not
affect your computation of φ. For example, using this procedure, you
would find that the square root of 9,912 is 99.4987, when it is actually
99.5590. The difference--0.0603--is insignificant.



For numbers 100,001 to 10,000,000: (1) Take the number for which you
want the square root and move its decimal point four places to the left.
(2) Round off to the nearest whole number, and find this number in Column N.
(3) Take the number immediately to the right, in Column √N, and move its
decimal point two places to the right. That is the square root.
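
The three lookup procedures above can be sketched in a few lines. This is only an illustration under stated assumptions: `table_lookup` is a hypothetical stand-in for reading Column √N of the printed table (here it simply calls `math.sqrt` on a whole number from 1 to 1,000), and rounding halves upward is an assumption, since the instructions say only "round off to the nearest whole number."

```python
import math


def table_lookup(n):
    """Stand-in for the printed table: square root of a whole number 1..1000."""
    if not 1 <= n <= 1000:
        raise ValueError("the table only covers whole numbers 1 to 1,000")
    return math.sqrt(n)


def sqrt_via_table(x):
    """Approximate the square root of x (1 to 10,000,000) by the table method."""
    if not 1 <= x <= 10_000_000:
        raise ValueError("procedure covers numbers 1 to 10,000,000")
    if x <= 1000:
        places_left = 0   # look the number up directly
    elif x <= 100_000:
        places_left = 2   # move decimal point two places to the left
    else:
        places_left = 4   # move decimal point four places to the left
    n = int(x / 10 ** places_left + 0.5)  # round to nearest whole number
    # Move the answer's decimal point half as many places back to the right.
    return table_lookup(n) * 10 ** (places_left // 2)


print(round(sqrt_via_table(1200), 4))  # the 1,200 example in the text
print(round(sqrt_via_table(9912), 4))  # the 9,912 example, with rounding error
```

The second call reproduces the rounding error discussed above: 9,912 is looked up as 99, so the result falls slightly below the true square root.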
REVIEW QUESTIONS AND ANSWERS



Frederick Steinheiser, Jr.
U.S. Army Research Institute for the Behavioral and Social Sciences

This Appendix contains a set of questions and answers for each chapter.
This is not a set of test items. Rather, it is suggested that you attempt
to answer each question for a given chapter after reading that chapter.
You can then check your answer with the supplied answer.

In many instances, the questions and answers supplement the material
provided in the chapter. Hence, it will be a "learning experience" for
you to study these questions and answers. A few questions were designed
to be thought-provoking, and will require some creative insight and
application of the information furnished in the text.

REVIEW PROBLEMS FOR CRT MANUAL



Chapter 1 



1. One of the important differences between norm-referenced tests and
criterion-referenced tests is this: an NRT has mostly knowledge-type
items, whereas a CRT has mainly performance-type items. (For example,
writing down the steps in cleaning an M-16 vs. actually cleaning it
properly.) True or false?



2. [...] students went to the rifle range, and each shot 20 rounds. The
spread of scores looked like this:

[Bar graph. Vertical axis: number of students getting this number of
direct hits (scale 0 to 14). Horizontal axis: number of direct hits out
of 20 shots, in ranges 0-2, 3-5, 6-8, 9-11, 12-14, 15-17, 18-20. Bar
labels legible in the source: 12 and 13.]

To help you in reading this graph, note that 4 students scored from
[...]. The instructor decided after the exercise to
exempt the top 20% of the students from further practice, while the
bottom 80% had to stay for more drill. How many students had to stay
for more practice? Is this marksmanship test an example of a CRT or
NRT, based upon the instructor's scoring procedure?



3. It's often helpful to plot a graph of test data, in order to get a
visual impression of the distribution of scores. The distribution
from an NRT is often quite different from the one of a CRT. (a)
In the distributions below, which one(s) do you think came from an
NRT, and which from a CRT? (b) The three scores of 30, 50, 80, shown
below, tell different stories, depending upon whether they relate to
the NRT or CRT distribution(s). How might you interpret these scores?
(c) What are some possible reasons (think about both training and testing)
for the differences in the shapes of the CR and NR scores as shown?

[Figure: distributions of test scores; axis label ends "getting a given
score."]




4. In comparing a large number of scores on a CRT before and after training,
the CRT is being used (a) as a diagnostic aid, (b) to evaluate the
instructor or program of instruction, (c) as a screening device.

5. A student got 90% of the problems on a math test correct, so he was
advanced directly to the computer course without having to take a
math refresher course. This math CRT was used (a) as a diagnostic
aid, (b) to evaluate the instructor or program of instruction, (c)
as a screening device.

6. A student passed every item on a test except one. He was then allowed
to enter the instruction program at the level of the test item that
he missed. The information from this CRT was used (a) as a diagnostic
aid, (b) to evaluate the instructor or program of instruction, (c) as
a screening device.

Chapter 2 

1. Hitting the outline of a moving enemy tank with an anti-tank round
is an example of a level one, level two, or level three objective?

2. Hitting an enemy tank in actual combat with an anti-tank round is an
example of a level one, level two, or level three objective?

3. Hitting the bull's eye of a stationary circular target with an anti-tank
round is an example of a level one, two, or three objective?



4. It is possible that a poorly specified test item given after one phase
of training might really be properly specified if given after another
phase of training. True or false? (Hint: Think of the type of
instructions or information given to solve a problem in an introductory
vs. an intermediate course.)

5. Matching. Match each example with the appropriate technical term.
The most significant parts of some examples are underlined.

a. Performance b. Conditions c. Standards

1. An action verb tells what is to be done by the student.

2. The task must be performed to a satisfactory criterion level.

3. The dial setting must be correct, to the nearest 1/2 degree.

4. A student has to tune a jeep engine using only the tools provided.

5. An indicator is essential in order to measure the main intent.

6. Just because a student can pass a hands-on test in the classroom
does not guarantee that he'll be able to pass the same test in
simulated (or real) combat.

6. The use of "unitary objectives" (a) requires that all tests be
independent, (b) is the implementation of a Level One objective (but
not Level Two or Three), (c) means that you don't have to divide
objectives into Performance, Conditions, and Standards, (d) requires
performance on more than one task at a time. (More than one choice
may be correct.)

7. "Given these pictures of five tools, identify the one used for removing
spark plugs by circling it." What is the main intent of the objective?
What is the indicator? What are some other indicators that could be
used without changing the main intent or the conditions?

8. "Cut a 6 inch diameter circle out of this piece of sheet metal using
the appropriate shears." What is the main intent of this objective?
What is the indicator?

9. Why is it essential that covert main intents have appropriate indicators?

10. In a couple sentences, explain what is meant by specifying
performances, conditions, and standards in "clear, operational terms."

11. Conditions and standards as specified for a Level One objective may
actually be improperly specified for a Level Two objective. True
or false?

12. Here's an extra "thought problem:"

Suppose that an instructor decided to test a helicopter pilot trainee
without reference to explicit objectives. He merely "went along for
the ride" while the student executed various maneuvers of his own
choosing, and without knowing exactly which ones he ought to do or
what the passing criterion was. (This is, of course, a highly
unrealistic example, but it will help to focus upon some very realistic
issues that crop up in the use of criterion referenced tests.)
After studying this CRT manual, the instructor thought that he would
be able to improve his test. How might he go about it? (You don't
have to be an expert in helicopter terminology to come up with a few
overall suggestions.) What kinds of data might the instructor want
to record when the student is executing various maneuvers?

Chapter 3

1. Giving a trainee a paper and pencil test on how to fire a mortar is
of higher fidelity than evaluating him on a dry-fire test. True or
false?

2. At the end of a medic's training, the instructor decided to pass only
those students who got at least [?]0 out of 50 paper and pencil test
items correct. Do you think that this was a good type of test to
certify a student as a medic? Why or why not? How would you improve
the test?

3. Another medical instructor decided to give his students 30 simulated
injuries on dummies to treat, out of the total of 40 such injuries
that had been covered in the course. For a passing score, 25 out of
the 30 injuries had to be treated perfectly. How does this test compare
to the first instructor's test? What might be done to improve upon
this test?

4. Another medical instructor gave his students all 40 of the injuries
that had been taught in the course on the test dummies. A passing
score was 33 out of 40. How does this test compare to the first two
tests mentioned above? What might still be done to improve this test,
assuming that no practical constraints stood in the way? What if there
were constraints, so that not all students could be tested on all the
injuries?

5. Which is not an "objective" test: (a) true-false, (b) matching,
(c) essay, (d) multiple choice, (e) completion or fill-in-the-blank.

6. Having a person conduct the testing who was not the course instructor
may help to eliminate the error of (a) standards, (b) logic, (c) central
tendency, (d) halo.

7. Match the type of measurement with the correct example.

a. Process b. Product c. Process and Product

1. Find out if this battery has enough charge to start a jeep.

2. Using dry-fire techniques, fire 10 M-102 Howitzer rounds for these
ten target settings.

3. Using the proper procedures during live fire for the above howitzer,
at least 5 out of 10 rounds must impact within 25 meters of the
target.

8. What are some general reasons that may make it necessary to modify
conditions and standards from an ideal to a more practical setting?

9. Item sampling within objectives (a) is used where a concept must be
learned, (b) is used where there is a routine process to be learned,
(c) requires that a number of similar test items be produced from the
total (possibly infinite) number of such items, (d) means that the
same objective should be tested using a number of different items,
(e) means that the same items are derived from different objectives.
(More than one choice may be correct.)






10. Why should both easy and difficult conditions be used when testing
under multiple conditions?

11. Sgt. Smith suspects that PFC Jones may not really be able to remove
the spark plugs in one minute or less. Jones' times for three spark
plugs were 59, 58, 58 sec. The next lowest score was by Duncan, whose
times were 50, 52, and 53 sec. So Sgt. Smith singled out PFC Jones
to do a fourth plug removal, as an extra (and unplanned) part of the
test. Do you agree with Smith's decision? Why or why not?

12. How many decision points are there in the flow chart on p. 35?

Chapter 4

1. What are the specific steps of the Test Plan Worksheet? How are they
to be used?

2. Evaluate this statement: "Good instructions do not give any hints
to the students. The more that a student taking a test has to figure
out for himself about the test, the better the test."

3. An inadequate test item is one which (a) is of low fidelity, (b)
requires an indicator response, (c) is of high fidelity, (d) has
stricter conditions than those which were stated in the objective,
(e) has good agreement between the standards of the objective and the
test item.

Chapter 5

1. In choosing a group of Non-Masters, why can't you just choose people
from any group which has not had the training experience that your
group of Masters has had?

2. An instructor was designing a new electronics course. He decided
that he needed 40 items on his final exam. On how many people should
he try out this version of the exam? How many should be Masters, and
how many should be Non-Masters?

3. Continuing with the above example, question H on this try-out exam
was multiple choice, dealing with the voltage drop in a step-down
transformer; 26 of the recent grads chose the correct answer, whereas
6 of the non-masters selected it. What do you think about the value
of this item?

4. Question #17 was a true-false item, asking if a tunnel diode could
be substituted for a malfunctioning capacitor if wired in parallel
to the nearest transistor; 18 of the recent grads got it right,
whereas 13 of the non-masters got it right. What do you think about
the value of this item?

5. Question #14 asked if household voltage was a.c. or d.c.; [illegible] of the
grads got it right, and 29 of the non-masters got it right. What do
you think about the value of this item?

Chapter 6 

For each of the terms discussed in this chapter, select the appropriate
example or description. There are no duplications.

a. Personal Variables
b. Scoring
c. Fixed Point
d. Go/No-Go
e. Hands-On
f. False Positive
g. Rating Scale
h. Familiarization
i. False Negative
j. Assist Scoring
k. Uniform Instructions
l. Environmental Variable



1. On Monday, PFC Jones passed a practice test, which his instructor
said was just like the real one that was to be given on Wed. But
Jones caught the flu on Tuesday, and still took the test on Wed.
He failed the test, and as a result was not graduated into the next
sequence of instruction.

2. All students should be equally alert, not hungry or tired.

3. Tester should know how to give the test, perhaps by having watched
someone else conduct it previously.

4. Testing with the real device, apparatus, weapon, or machine.

5. The student has to do only those items again which he missed, and
does not have to retake the whole test.

6. Student either knows how, or doesn't know how; there's no in-between
"partial knowledge."

7. Conditions that, if changed from one group to the next, might
(falsely) suggest that there's something wrong or unreliable about
the test.

8. Numbers are assigned to performance on each item.

9. If a numerical answer is close enough to the correct answer, it will
be scored as correct.

10. Don't give extra hints or play favorites with people taking the test.

11. Determine if the student's performance met the specified standard.

12. PFC Smith has just advanced from the introductory to the intermediate
automotive repair course. He was not able to tune an engine
completely at the start of the intermediate course, although he had
done so in order to pass the introductory course.

13. Although a student mechanic successfully passed the engine tuning
section of an automotive CRT, he lost 1 tool, broke another, and
got grease all over the place. Is this aspect of his performance
significant, although it was not explicitly "tested" by any items
of the actual test?






14. If a student passes (a) 2, (b) 3, (c) 4 objectives on a CRT with
4 objectives, then he should be passed on the whole test.

Chapter 7

1. "Reliability," when talking about tests, means about the same as
(a) validity, (b) that the same scores should obtain on a second
administration of the test to the same people, (c) that the test
measures what it's supposed to measure, (d) standardization of training
and testing conditions.

2. If validity is high, reliability will usually be (a) high, (b) low,
(c) can be either high or low.

3. A test could be very reliable but not very valid. True or false?
Can you think of an example to back up your answer?

4. Higher fidelity test items may help to increase (a) reliability,
(b) validity, (c) both, (d) neither.

5. Why should only a short time (like a couple of days) elapse when
conducting a test and retest reliability check?
6. A class of 30 M.P. students took a test at 1000 on Monday, and were
given the same test (because the instructor wanted to conduct a
reliability check) on Tuesday at 1900. (1900 was the only time
that he could get all of the students together.) The results were:

                      First Day
                    Fail    Pass
Second Day   Pass     2      17
             Fail     1      10

Compute the value of phi. What does this value suggest?

7. Another instructor decided to compare the results of his CRT given
to the 28 students in his class with ratings of each student's
performance as given by an expert observer. The results were:

                          CRT Results
                          Fail    Pass
Expert's Ratings   Pass     1      20
                   Fail     5       2

Compute the value of phi. What does this value suggest?

8. Here's another "thought question" that will help to prepare you for
some of the more complex uses of CRTs in operational situations.
A Corps of Engineers test produced the following results:

                            Form A given Mon.
                              Fail    Pass
Form A given on Wed.   Pass     5      22
                       Fail     2      11

What is the value of phi, for test-retest reliability? Is it an
acceptable value?

The instructor was not pleased with this value of phi, and so he gave
the same class another form of the test (Form B) on Fri. His aim
was to compare the results from Form B with the results of Form A,
as the latter was given on Mon. and Wed. The new data looked as
follows:

                        Form A on Mon.
                         Fail    Pass
Form B on Friday   Pass    1      35
                   Fail    3       1

                        Form A on Wed.
                         Fail    Pass
Form B on Friday   Pass    6      28
                   Fail    2       5
What are the values of phi for these two tables?
Now interpret the values of all three coefficients that you've
calculated; that is, what do you think the phi values for Form A on Mon.
vs. Form B, and Form A on Wed. vs. Form B mean?

Answers to Review Questions

Chapter 1

1. False. See page 1-2. The important differences between NRTs
and CRTs are listed in Fig. 1-1.

2. If the standard specified in this problem is used, then 10 students
will have to stay for more practice. This is an NRT, because the
tester chose a passing standard on the basis of how well a student
performed relative to other students. Note that with this kind of
decision standard, only the top 20% of the students would pass even
if (a) all students had performed "poorly" (all had obtained only 7
or less direct hits), (b) all students had performed "very well"
(all obtained 15 or more direct hits).

3a. Distribution A is from an NRT, whereas B and B' are from a CRT.

3b. Score of 80: on the NRT, only a small percentage of the students got
this score or higher; on the CRT, most of the people whom we might
label "master" got a score near 80.
Score of 50: on the NRT, more people got this score than any other
score; whereas on the CRT, no one got this middle score.
Score of 20: on the NRT, only a small percentage of the students got
this score or lower; whereas on the CRT, most of the people whom we
might label as "non-masters" got a score near 20.
The NRT spreads people out on a distribution of scores, so that very
few students do really well on the test, and very few do really poorly.
Most tend to cluster around the middle, or average. The CRT ideally
tries to spread people into two separate and non-overlapping groups:
those who clearly passed the test, and those who clearly failed to
pass it. (Masters and non-masters, or distributions B and B'.)

3c. There may be several reasons for the differences in the shapes of the
curves. Consider differences in training procedures. Students
described by curve A (the NR curve) may have been trained in a group,
and given the same amount of training before being tested. Students
described by curve B' may have received individually prescribed
instruction (each student learning at his own pace), and then tested when
he felt prepared to take the test.
Note that an NRT is designed to spread people out at the extreme scores,
so that very few people do really well, and very few people do really
poorly. Most people fall near the middle. A CRT is designed so that
people who really have mastered the material will do well, and those
who have not will do poorly on the test. A CRT is not used to assign
grades to people, other than "pass-fail." If we use a CRT, we must
care more about whether person X has mastered the task than if person
X got a better score than person Y.

Consider, as a simple example, the "task" of broad-jumping. If we
measure how far each person can jump, then we're using the distance
measurement as an NRT. As a result of these measurements, we'll know
if person X can jump farther than person Y, and we'll be able to plot
a distribution of scores as in distribution A. Now suppose that we
dig a 1.5 meter ditch, as the minimum criterion distance that a person
must be able to jump in order to pass the jumping test. If a person
can jump the ditch, we'll pass him; if not, he'll fall in, i.e., it will
be obvious that he failed. This CRT is pass-fail oriented, since
we're not interested in how far each student jumped. Rather, we just
want to know if each student was able to jump across the ditch.
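
The broad-jump example can also be sketched in a few lines of Python. This is only an illustration: the three jumpers and their distances are made up, and treating a jump of exactly 1.5 meters as a pass is an assumption, not something the manual specifies.

```python
# Hypothetical jump distances in meters (invented for illustration).
jumps = {"X": 2.1, "Y": 1.4, "Z": 1.6}

CRITERION_M = 1.5  # width of the ditch in the example above


def nrt_ranking(distances):
    """NRT use: order people from longest to shortest jump."""
    return sorted(distances, key=distances.get, reverse=True)


def crt_results(distances, criterion=CRITERION_M):
    """CRT use: each person either clears the ditch or does not.

    Assumption: a jump equal to the criterion counts as a pass.
    """
    return {name: ("pass" if d >= criterion else "fail")
            for name, d in distances.items()}


print(nrt_ranking(jumps))
print(crt_results(jumps))
```

Notice that the ranking tells you X outjumped Z, while the pass-fail result treats them alike: both cleared the ditch.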

4. b.

5. c. 

6. a. 

Chapter 2 

1. Level Two. This is a very close approximation ("high fidelity") to
the "real world" situation.

2. Level One. This is the "real world" situation, which is impossible
to totally duplicate in any kind of test setting.

3. Level Three. The target used here is much more artificial than the
outline of a moving tank, which we just described as a Level Two
objective. In general, Level Three objectives must be passed before Level
Two objectives are tested. Obviously, a student must learn how to
load and fire an anti-tank round before he can even hope to hit the
center of a stationary target.

What level objective would this learning process be? Also a Level
Three. Piecemeal assessment of a subcomponent of the actual desired
behavior in an artificial setting constitutes a Level Three objective.
So this example actually involved only two Level Three objectives:
making sure that the weapon can be loaded and fired correctly, and then
testing the student's accuracy of firing at an "artificial" target.

4. True. For example, a student at the end of a training sequence should
not need the broad hints that you gave him during the earlier phases
of training. Thus, early in an electronics course the test conditions
might specify the specific components or instruments to be used in
trouble-shooting malfunctioning equipment.

5. 1-a. 2-c. 3-c. 4-b. 5-a. 6-b.

6. a, d.

7. Main intent: Identify or recognize the spark plug wrench. Indicator:
circling the picture of the wrench. Alternative indicators: Pointing
out the picture, or placing a check mark by the picture.

8. The student has to first choose the appropriate shears, and then use
them properly in order to cut a six inch circle. So the first main
intent is the actual choice of the correct tool. The second (and
perhaps more important) main intent is the proper use of the tool in
cutting the sheet metal.

9. Overt (think of "open") main intents specify the required performance,
how to measure it, and do not require indicator responses. Covert
(think of "covered") main intents do not allow us to directly measure
the desired performance. For example, an anti-aircraft test might
require the gunnery crew to distinguish between the outlines of friendly
vs. hostile planes. One way to conduct the test would be to have gunnery
students draw pictures of Phantoms, MIGs, etc. A simpler and better
indicator would be to give black profiles of all such aircraft, and
have the student indicate (by circling, placing a checkmark, etc.)
whether each craft is friendly or hostile.

10. Performances should be stated by specific action verbs. Conditions
and standards will not be adequate if you have to supply any additional
information. You should not have to interpret or figure out what
is meant by the conditions and standards of statements if they are
operationally defined.

11. True. Recall that a Level One objective refers to the actual objective
in meaningful units of work activity in operational environments:
"on-the-job performance."
On the other hand, Level Three objectives include enabling skills and
learning elements. A person must be able to perform these in order
to correctly perform Level Two and One objectives. As an example,
a Level One conditions statement might be: "Given a malfunctioning
generator..." This would be appropriate for testing an advanced
electrical technician, but not for one who had just completed the
beginning course. The more appropriate conditions statement for the
novice student should include more specific information ("helpful
hints"), such as: "Given a 45 KW generator with a broken shaft
bearing..." This would then be a Level Two or Three conditions
statement.

This example shows that improperly specified conditions at one level
of objective may indeed be properly specified at another level.

12. Consider how the instructor could increase the structure and
specificity of testing. How? By setting various objectives: Performance
(handling the proper controls in the right sequence), Conditions
(executing different maneuvers, flying with or against the wind, with
and without a couple of tons of dead weight), and Standards (landing
on a given target, making a "soft" landing, etc.). He should have a
checklist of these many objectives made up before testing the trainee,
so that he won't have to rely on his own intuitive evaluation and
memory for what the entire set of scores was.
The instructor would want to record such data as: errors that the
student made in carrying out various maneuvers, student's response
times and hesitation, whether the student's response brought the craft
within the range of the appropriate standard (did he fly on course,
did he land on target, etc.?).

Chapter 3 

1. False. Higher fidelity items are more realistic and require
"hands-on" performance.

2. No. This is only a paper and pencil test. You should have the trainees
perform some of the behaviors that they will be required to perform on
the job. Getting only 40 out of 50 questions correct also seems to
be a rather lax standard, especially in a critical area like medical
training. Incomplete or imperfect knowledge could result in needless
suffering or even death.

3. This is better, because it is now a simulated "hands-on" performance
test. However, only 30 test items have been chosen from the 40 injuries
which had been covered in the course. And only 25 of the 30 items need
to be passed. So this less-than-full coverage also seems to be a rather
lax standard.

4. This is a better test. Assuming that the items were reliable and valid
(see chapters 5 and 7), the only obvious way to improve the test would
be to increase the number of items. This would cover more variations
of the original 40 types of injuries. If there were practical
constraints as proposed, you might then want to randomly divide the class
into two groups of 25 students each. Then randomly divide the 40 test
items into two groups of 20 each. Thus, each student would get only
20 problems, but he would not know which 20 beforehand. He would have
to do all 20 correctly.
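
One way to carry out the random division described in answer 4 is sketched below in Python. The class size of 50 follows from "two groups of 25"; the student and injury labels and the seed value are placeholders made up for this sketch.

```python
import random


def split_in_half(things, rng):
    """Shuffle a copy of the list and cut it into two equal halves."""
    shuffled = list(things)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


# Placeholder labels; a real tester would use rosters and injury names.
students = [f"student_{i}" for i in range(1, 51)]
injuries = [f"injury_{i}" for i in range(1, 41)]

rng = random.Random(1975)  # arbitrary fixed seed so the split can be repeated
group_1, group_2 = split_in_half(students, rng)
items_1, items_2 = split_in_half(injuries, rng)

# Each group of students is tested on one half of the injuries.
assignment = {"group_1": (group_1, items_1), "group_2": (group_2, items_2)}
for name, (people, items) in assignment.items():
    print(name, len(people), "students,", len(items), "injuries")
```

Because the halves come from a shuffle, a student has no way to tell in advance which 20 injuries he will face, provided the tester keeps the split to himself until test time.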

5. c, e. All of the other choices in this answer could be "machine-scored."
Be aware that sometimes more than one answer can be correct in
fill-in-the-blank items. Both this type of item and essay questions require
judgment by the scorer.

6. The instructor might be tempted to give his own students slightly
higher marks just to make himself look good.

7. 1-b. 2-a. (Only the settings are measured; no live fire is used.)

8. You may have to cut down on the amount of supplies used in the test
(fuel, ammunition, etc.) because of excessive cost. You may have to
conduct the test for a shorter time length than you'd like to, because
of: large numbers of students, small number of judges, limited
availability of test site.



9. c, d.

10. Suppose that the subject fails under one or more of the difficult
conditions. Was it because he couldn't do the task at all, or because
a condition was just too difficult? If you have one easy condition,
and the subject passes that phase of the test, you'll at least know
that he can do the task, although perhaps not under all conditions of
difficulty.

11. No. He's letting his own subjective feelings and perhaps personal
dislike bias his interpretation of the scores for Jones. "It is
never proper to add test items during a test administration" (p. 3-31).

12. Five. Each of the "diamonds" requires that a yes-no decision be made
at that point.

Chapter 4 

1. The column headings in Fig. 3-11 indicate the specific guidelines
which are explained in more detail on p. 4-2. In actual practice,
it may often be easiest if you first of all make up a test item
from your own assessment of the guidelines, and then check it against
the specifications listed in Fig. 3-11. That is, after you have
created a test item and specified the performance, conditions, and
standards, all you have left to do is fill in the columns of the
worksheet.



2. Note that on p. 4-6, hints are acceptable. Furthermore, the
guidelines on p. 4-7 suggest that as a general rule, specific
instructions should be supplied to the student. Hands-on
performance items should have performance, conditions, and
standards explicitly stated in operational terms.

3. d. Performance, conditions, and standards must match in the
objective and in the test item. Level of fidelity, by itself, does
not make an item good or bad. And an objective may have an overt
main intent or require an indicator response.

Chapter 5

1. The non-masters group must be composed of people who have met the
minimal requirements for entering the course. They should be an
actual sample of, or at least representative of, the people who will be
taking the course. Think of how absurd it would be to use as the
non-masters a group of secretaries, simply because none of them had ever
done anything similar to what the test was all about (such as
disassembling and cleaning an M-16)! Because none of them will ever do
it, people from this secretarial group cannot be used as your group of
non-masters.

2. 3/2 x 40 = 60 people altogether. Half should be masters (30), and
half should be non-masters (30).

Do NOT let the number of available masters and non-masters in the
tryout population dictate the number of items on your test. You
MUST get enough people to test out the number of items you feel are
necessary.

3.          Non-Masters   Masters
   Pass          6           26
   Fail         24            4

Note that 10 people (16.7%) were incorrectly classified. Yes, this
item seems to discriminate between masters and non-masters fairly
well.
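
The arithmetic in answers 2 and 3 can be sketched in Python as follows. The function names are made up for this illustration, and rounding up for an odd number of items is an assumption of the sketch, not a rule stated in the manual.

```python
import math


def tryout_size(n_items):
    """Apply the 3/2 rule from answer 2: people needed = 3/2 x items.

    Returns (total, masters, non_masters), split as evenly as possible.
    Assumption: round up when 3/2 x items is not a whole number.
    """
    total = math.ceil(3 * n_items / 2)
    masters = total // 2
    return total, masters, total - masters


def misclassified(masters_fail, non_masters_pass, total_people):
    """Count the people an item puts in the wrong group.

    A master who fails, or a non-master who passes, is misclassified.
    Returns (count, percent of the whole tryout group).
    """
    count = masters_fail + non_masters_pass
    return count, 100 * count / total_people


total, masters, non_masters = tryout_size(40)
print(total, masters, non_masters)

# Item from answer 3: 4 masters failed it, 6 non-masters passed it.
count, percent = misclassified(masters_fail=4, non_masters_pass=6,
                               total_people=total)
print(count, round(percent, 1))
```

The smaller the misclassified percentage, the better the item separates the two groups.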

4.          Non-Masters   Masters
   Pass         13           18
   Fail         17           12

There is a 50-50 chance of getting this item correct just by guessing,
so you'd expect about 15 people out of 30 to get it right by chance
alone. And indeed, 18 of the masters got it right, and 13 of the
non-masters got it right. Since only 3 more masters got it right
than would be expected by chance, the item must be so difficult that
it should be discarded.

5. Since so many non-masters got this item correct, the item should
be omitted. It just didn't separate the masters from the
non-masters.

Chapter 6

13. Yes. Although the product was actually good repair work (so
that the engine would indeed run smoothly), the process by which he
achieved that product should also be noted by the examiner. And part
of the process includes the trainee's careless behavior.

It's possible that the student could use some remedial practice in
how he does repair work, even though he is able to perform the actual
tuning and repairs successfully.

14. c. The trainee must pass the minimal number of items for each
objective. You can't just add up the total number of items passed
across all objectives, and then see if that value exceeds the criterion
value for the overall test. Rather, each objective must be passed at
some minimal level in order for the whole test to be passed.
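
The scoring rule in answer 14 can be sketched in Python as follows. The objective names, the number of items per objective, and the minimum of 4 are invented for this illustration.

```python
def passes_whole_test(items_passed, minimum_per_objective):
    """Return True only if every objective meets its own minimum.

    The total number of items passed is deliberately ignored.
    """
    return all(items_passed.get(objective, 0) >= minimum
               for objective, minimum in minimum_per_objective.items())


# Hypothetical CRT: 4 objectives, 5 items each, 4 needed on each.
minimums = {"obj_1": 4, "obj_2": 4, "obj_3": 4, "obj_4": 4}

trainee_a = {"obj_1": 5, "obj_2": 5, "obj_3": 5, "obj_4": 2}  # 17 items
trainee_b = {"obj_1": 4, "obj_2": 4, "obj_3": 4, "obj_4": 4}  # 16 items

print(passes_whole_test(trainee_a, minimums))
print(passes_whole_test(trainee_b, minimums))
```

Trainee A passed more items in all than trainee B, yet only B passes the whole test, because A fell short on one objective.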



Chapter 7

1. b. Think of reliability as the repeatability of test scores. Choices
a and c refer to validity: does the test measure what it is supposed
to measure? Choice d may help to increase reliability, but is not
the correct answer here because it could refer to other things besides
reliability.

2. a. If the test is really measuring what it's supposed to measure,
then you should get about the same results when conducting a
test-retest reliability check. Of course, external conditions and personal
variables could decrease the reliability of the test results, as
could confusion among judges about scoring procedures.

3. True. To take an oversimplified example, suppose that you thought
that a baseball player's batting ability could be measured by (or
predicted by, or was related to) his throwing ability. Certainly
the maximum distance that he can throw a baseball will be a rather
reliable measure over many such throwing trials. But the distance
that he can throw a ball is not a valid measure of (may not be highly
correlated with) his batting ability.

4. c. Validity will be increased because the test is a closer
approximation to the "real thing." And higher fidelity means that irrelevant
factors which might otherwise influence the performance of the test
taker are reduced. Therefore, repeated performances should be more
consistent. And the more consistent the performance, the higher the
reliability.

5. People forget things over a period of time. And, some things
that people learn since taking a test may interfere with the
knowledge or skill that had been previously learned to pass the
test.

6. phi = (1 x 17 - 10 x 2) / sqrt(19 x 11 x 3 x 27)
       = -3 / sqrt(209 x 81) = -.02

Either conditions or personal variables (or both) were undesirable
on the second day. Actually, the trainees were probably just too
tired and poorly motivated to be taking a test at 1900.

7. phi = (20 x 5 - 1 x 2) / sqrt(21 x 7 x 22 x 6)
       = (100 - 2) / sqrt(19,404) = +.70

There seems to be rather high concurrent validity.
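
If you would rather let a computer do the arithmetic, the phi computations in answers 6 and 7 can be checked with a short Python sketch. The function and argument names are made up for this illustration; the formula is the ordinary phi coefficient for a 2 x 2 table (product of the agreeing cells minus product of the disagreeing cells, divided by the square root of the product of the four row and column totals).

```python
from math import sqrt


def phi(both_pass, both_fail, first_only_pass, second_only_pass):
    """Phi coefficient for a 2 x 2 pass-fail table.

    both_pass, both_fail: people classified the same way both times.
    first_only_pass: passed the first measure, failed the second.
    second_only_pass: failed the first measure, passed the second.
    """
    first_pass = both_pass + first_only_pass
    first_fail = both_fail + second_only_pass
    second_pass = both_pass + second_only_pass
    second_fail = both_fail + first_only_pass
    product = first_pass * first_fail * second_pass * second_fail
    if product == 0:
        raise ValueError("phi is undefined when a row or column total is 0")
    return (both_pass * both_fail
            - first_only_pass * second_only_pass) / sqrt(product)


# Answer 6: first day vs. second day.
print(round(phi(both_pass=17, both_fail=1,
                first_only_pass=10, second_only_pass=2), 2))

# Answer 7: CRT result (first measure) vs. expert's rating (second).
print(round(phi(both_pass=20, both_fail=5,
                first_only_pass=2, second_only_pass=1), 2))
```

The two printed values agree with the hand computations above: about -.02 for the test-retest check and about +.70 for the comparison with the expert's ratings.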
8. phi = (22 x 2 - 5 x 11) / sqrt(27 x 13 x 33 x 7) = -.04

   phi = (3 x 35 - 1 x 1) / sqrt(36 x 4 x 36 x 4) = +.72

   phi = (28 x 2 - 6 x 5) / sqrt(34 x 7 x 33 x 8) = +.10

The first value of phi, -.04, is so low that there is very poor
test-retest reliability for Form A.
Examining the second and third phi coefficients, we may note
that the Form A results from Monday correlate very highly with the
Form B results from Friday. However, the Form A results from
Wed. correlate very poorly with Form B results from Fri. What is
the tester able to infer from all of this?
Well, something was probably quite unfavorable when Form A was
given on Wed. Perhaps conditions or personal variables were
adverse.
It therefore seems that Form A is reliable, Form B is also reliable,
and that we can dismiss the results of Wed. as arising from adverse
conditions external to the test.






